Due to the COVID-19 pandemic, the OpenStack PTG for the Victoria design discussions was held virtually. This was the first-ever virtual event of its kind for OpenStack. Overall it was a successful event, nicely organized by the OSF. You can find an overall high-level recap of the PTG here. This blog covers the QA, TC, and nova-oslo cross-project sessions I attended.
All Projects Etherpad for details: http://ptg.openstack.org/etherpads.html
We discussed the things we finished, like the py2 drop, adding the ideas repository, and defining the process for dropping projects from OpenStack governance. Next, we discussed:
- PTL-less projects, which took a lot of the TC's time to get appointments done.
- Making TC office hours more effective. One way is to change back to regular meetings; Naser will be proposing the vote for that.
- How to connect projects and the TC more closely. We decided to continue with the liaison arrangement and regular check-ins with PTLs about project health.
Make TC member on-boarding smoother
When new TC members are elected, we have some on-boarding tasks for them, for example how actively they need to participate in TC activities, what to review, which meetings and other events to attend, etc. We do not have those steps written down in a clear document. When I was elected to the TC, Doug (TC chair) sent us an email to help with those on-boarding things.
We decided to document those steps so that they will be helpful to each new member joining the TC.
- Document how to propose changes for review (tags to use, etc)
- Document the on-boarding information like they have for PTLs, etc. Collect up the tribal knowledge. – diablo_rojo
This was an ongoing thing to fix: the User Committee has lacked potential candidate nominations in the past couple of elections. In the Technical Committee, we decided to merge both committees into a single governance team. ttx will be coordinating with the UC and will propose the merged structure and proposal.
ML thread: http://lists.openstack.org/pipermail/openstack-discuss/2020-May/014736.html
- ttx to sync with the UC to line things up
Help Wanted List
We will be continuing this list and revising it every year. The TC has also given this list to the Board of Directors to get help from the organization side. We have many areas on the 2020 help-needed list: https://governance.openstack.org/tc/reference/upstream-investment-opportunities/2020/index.html
TC position on OSF Foundation member community contributions
OpenStack has been facing declining contributions for many years. Many projects face this issue, and the TC also presented it to the Board of Directors at the Berlin summit and worked on a few things after their feedback. But there has been no outcome or help on the contribution issues.
In this PTG, the TC discussed the next steps and proposed some ideas for the BoD, like:
- Enforce or remove the minimum contribution level
- Give gold members the chance to have increased visibility (perhaps giving them some of the platinum member advantages) if they supplement their monetary contributions with contributor contributions.
There is no conclusive item or way forward on this topic yet; Naser (TC chair), who is also on the BoD, will bring this to a BoD meeting.
- Mohammed will take it forward to the board and see what, if any, feedback we get. Will try to summarize the discussion as much as possible.
OpenStack user-facing APIs in the wild (i.e. OpenStackClient)
This has been a long-pending item from the OpenStack usage point of view. OpenStackClient (OSC) is a standard and consistent client for the OpenStack APIs, but the migration is not finished in many projects. For example, glance is one project that does not want to migrate to OSC due to resource issues.
Artem tried to propose this as a community goal in the Train cycle, but it was not selected due to some resistance. Everyone agreed to continue this effort and try to finish it soon. As the next step, we will be forming a pop-up team in the Victoria cycle and trying to finish more projects before we declare this a community-wide goal. Below are the next steps on this topic:
Step 1. Form the pop-up team.
Step 2. The pop-up team starts conversations about why this is important and digs up issues.
Step 3. Document the process (maybe using nova as an example).
Monitoring in OpenStack: Ceilometer + Telemetry + Gnocchi state
Gnocchi is in a separate repo and not well maintained, and Ceilometer depends on Gnocchi in some ways. This has been discussed many times in the past. One proposal was to move away from Ceilometer and remove the dependence on Gnocchi. There is no conclusion on this topic yet, and other ongoing ideas are:
- Ceilosca? Merging Ceilometer and monasca teams ?
- Use oslo.metrics to become interface above all tools we have
Pop-up teams retrospective: We currently have two active pop-up teams: policy and image encryption. The policy team has not made very fast progress yet. I have now finished the Nova policy work and will be able to concentrate more on the other projects. Image encryption progress is also slow.
- Reducing community goals per cycle: There is no hard written rule requiring two community-wide goals per cycle, but we have always been selecting two as the minimum in each cycle. The TC discussed this and decided to document that there is no minimum number of goals, and that goals can even be skipped for a cycle.
Action Item: Clearly outline the documentation for goal proposal and selection, documenting that we don't have to have 2 or 3 goals – gmann
- Victoria goal: We selected one community-wide goal: migration of the legacy jobs to native Zuul v3. We also discussed having "migrating the CI/CD to Ubuntu Focal".
- W cycle goal discussion kick-off: As per the new goal schedule, we need to start the W cycle goal discussion in the V cycle so that we can finalize the list of community goals when W cycle development starts.
Action Items: TC Needs to have one or two people to drive W goal selection based on the timeline in the governance repo – njohnston & mugsie
Detecting unmaintained projects early
A few projects become unmaintained mid-cycle or at the start of a cycle, but we only notice during the PTL election, when there are no PTL candidates for those projects. Congress and Tricircle are the two recent examples. The TC discussed re-enabling the project health checks, but the liaisons are supposed to do the same activity. It is best practice for TC liaisons to check twice per cycle. The release and QA teams can also flag inactive projects/repos. The TC will continue assigning project liaisons for the Victoria cycle too.
PTL role in today’s OpenStack / Leaderless projects
Every cycle there are many projects with no PTL candidates, and in the Victoria cycle the number was large. The Technical Committee discussed a new option of replacing the PTL role with decentralized responsibility via liaisons for different activities, for example a release liaison, infra liaison, TC liaison, etc.
The TC had mixed votes on the new proposal; many TC members think it just renames the PTL without actually changing anything. In the current model, too, the PTL can delegate their duties to anyone.
The second option was to allow projects to try a multiple-maintainers role while the others keep continuing with the PTL model.
Summary: Doesn’t appear that anyone is opposed to allowing teams to experiment with having multiple maintainers rather than a PTL.
Needs to be documented. Perhaps in the reference.yml file.
- Resolution for how we want to handle optionally splitting PTL role (summarize discussion)- njohnston & evrardjp
Reducing systems and friction to drive change
OpenStack had a lot of processes sized for its past level of activity and number of contributors. But OpenStack has now shrunk in both contributors and activity. To make things faster, we need to reduce or drop some of the processes around various activities.
The TC discussed what problems we face, to list them and solve them one by one. Single-maintainer teams like requirements are one of the key things to solve; there were discussions about defragmenting OpenStack, meaning merging related teams into one, for example merging requirements into Oslo. Below is the list of problems in this area:
- TC separate from UC (solution in progress)
- Stable releases being approved by a separate team (solution in progress)
- Making repository creation faster (especially for established project teams)
- Create a process blueprint for project team mergers
- Requirements Team being a one-person hero 🙂
- Stable Team
- Consolidate the agent experience
- Figure out how to improve project <–> openstack client/sdk interaction.
Discuss tag “tc:approved-release”, should we deprecate/remove it?
This came up from the Manila PTG. Goutam pinged me on the TC channel about adopting the TC tags in Manila, and while checking this tag we found that it was introduced in the old model of OpenStack, when we had the incubated vs. integrated project concept. This tag was a reference for the BoD and Interop team to know how mature a project is and whether it follows the release model, so that they could consider including that project in the interop certification program.
We concluded to remove this tag and notify the BoD/Interop group to refer to the project.yaml for OpenStack released projects.
- Proposed the removal of the ‘tc:approved-release’ tag and indicate that projects in OpenStack Repos are TC approved -gmann
OpenStack 2.0: Kubernetes-native
This is a new tag idea from Zane – https://review.opendev.org/#/c/736369/
“A common starting point for an OpenStack cloud that can be used to deploy
Kubernetes clusters on virtual machines in multiple tenants, and provides all
of the services that Kubernetes expects from a cloud.” This was not discussed much, as it came at the end of the PTG, but the idea was to kick off the discussion and start collecting feedback on the review.
We discussed the good things we did in the Ussuri cycle and what to improve. Bug triage was one of the key items and made good progress; credit goes to kopecmartin. We also have a few new cores in QA: yoctozepto in devstack and kopecmartin in Tempest.
During the py2 drop we faced a lot of issues keeping the stable branch testing stable. We were able to fix those issues in time and keep the gate healthy.
A few things needing more improvement are keystone system-scope testing and Patrole maintenance. We will keep working on those as priorities in the Victoria cycle.
- Open bug discussions need to be done in PTG
- keep bug triage
- QA Office hour Time (if we have the time to discuss)
- AGREED: move to 13:00UTC
Make tempest scenario manager a stable interface
We need to find the manager methods common among plugins and define them in Tempest. Plugins should then reuse the code from Tempest and drop the duplicated methods from their repositories.
We discussed a few ideas:
- Audit around all the tempest plugins and add all repeated methods within scenario manager
- Find the methods which are really actually *used* by the plugins
- Audit if any plugins still using the Tempest scenario manager?
- Audit around all the tempest plugins and make methods consistent with parameters
- Clean up the existing methods, since a single class is populated with a lot of methods
- Break methods down to a single scope as much as possible.
- Adding more detailed docstrings, it should help us to understand the code/method/class
- Gmann, kopecmartin to create all the audit tasks on etherpad
- Sonia to list all plugins using a copy of the scenario manager under the ‘Audit’ section.
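The audit itself can be partly automated. Below is a rough sketch, not an existing QA tool: the sources, class names, and helper functions here are invented for illustration, showing how Python's `ast` module could list the methods a plugin's copied manager duplicates from Tempest's scenario manager.

```python
import ast


def method_names(source: str) -> set:
    """Return the names of all functions/methods defined in a source string."""
    tree = ast.parse(source)
    return {node.name for node in ast.walk(tree)
            if isinstance(node, ast.FunctionDef)}


def duplicated_methods(tempest_src: str, plugin_src: str) -> set:
    """Methods a plugin defines that already exist in Tempest's manager."""
    return method_names(tempest_src) & method_names(plugin_src)


# Toy stand-ins for tempest/scenario/manager.py and a plugin's copy.
tempest_manager = """
class ScenarioTest:
    def create_server(self): ...
    def create_volume(self): ...
"""
plugin_manager = """
class PluginScenarioTest:
    def create_server(self): ...
    def wait_for_thing(self): ...
"""
print(sorted(duplicated_methods(tempest_manager, plugin_manager)))
```

A real audit would read the actual files from each plugin repo and also compare method signatures, not just names, to flag the parameter inconsistencies mentioned above.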
Gates optimization by a better test scheduling
Tempest now has the –worker-file parameter, passed through to stestr, so that we can schedule the test execution over different workers in a balanced way. The idea here is to try distributing the test execution in gate jobs, but at the same time we need to think about the parallel execution of the scenario tests, which were made serial due to an ssh issue. We need to try this in the tempest-full-parallel job first and see how it works.
- arxcruz to add this in tempest-full-parallel
- arxcruz after that make tempest-full-parallel voting and rename it to tempest-next
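To illustrate the kind of balancing a worker file enables, here is a toy sketch; the test names and timings are invented, and this is not stestr's actual scheduler, just the classic longest-test-first greedy heuristic one could use to build balanced worker groups from historical timing data.

```python
import heapq


def balance_tests(durations: dict, workers: int) -> list:
    """Assign each test to the currently least-loaded worker,
    longest tests first (greedy LPT scheduling)."""
    heap = [(0.0, i) for i in range(workers)]  # (total seconds, worker index)
    heapq.heapify(heap)
    groups = [[] for _ in range(workers)]
    for test, secs in sorted(durations.items(), key=lambda kv: -kv[1]):
        load, idx = heapq.heappop(heap)
        groups[idx].append(test)
        heapq.heappush(heap, (load + secs, idx))
    return groups


timings = {"test_boot": 120, "test_attach": 90, "test_list": 5,
           "test_resize": 110, "test_show": 10}
print(balance_tests(timings, 2))
```

The resulting groups would then be written into the worker file handed to Tempest/stestr, so each gate worker gets a roughly equal share of runtime.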
Over the last few months, a tempest-cleanup Ansible role has been developed which will help us use (test) the cleanup tool in the gates. Lots of improvements have also been made which optimized the tool and made it more efficient; new services were added to the scope of the cleanup, and some bug fixes were made as well. We discussed a few improvements as next steps:
- Verify whether all leaked resources are cleaned up properly, for example by verifying the dry-run data.
- Add a flag so that cleanup can start failing the job on any leaked resource, so that we can fix the tests causing the resource leak. If not failing, then capture the data somewhere.
- kopecmartin to send a mail to the ML about the cleanup improvements we finished and their usage, and also cover the plugins extension in the ML
- kopecmartin to work on the two ideas above (the dry-run verification and the failing flag)
- gmann and masayukig can help with the plugins extension.
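The failing-flag idea could work roughly like this sketch; note the report layout and the `leaked_resources` key are assumptions for illustration, not the actual output format of the tempest cleanup tool.

```python
import json


def check_leaks(report_json: str, fail_on_leak: bool = False) -> int:
    """Inspect a cleanup dry-run report; return a non-zero exit code when
    leaks are found and failing is requested, otherwise just report them."""
    leaks = json.loads(report_json).get("leaked_resources", [])
    for item in leaks:
        print(f"leaked resource: {item}")
    return 1 if (leaks and fail_on_leak) else 0


# A job could run this after the cleanup dry run and use the return code
# to fail the job, or only log the leaks, depending on the flag.
print(check_leaks('{"leaked_resources": ["server-abc"]}', fail_on_leak=True))
```

Starting with the logging-only mode and flipping the flag later would let us find and fix the leaking tests before any job starts failing.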
Feature Freeze idea for new tests in Tempest and Patrole and other QA projects
We want to have a feature freeze for a few of the key repos under QA, like devstack, Tempest, Grenade, and Patrole. We agreed to call the feature freeze in the R-3 week of each cycle's release. Feature freeze in those projects means we will not be accepting any new test cases or enhancements at the end of the cycle, to avoid regressions during the cycle release.
- gmann to add the doc for this.
Cinder backends specific features testing in case of multi backends
There is a feature-flag option for backend-specific feature tests, but the multi-backend case is different and the feature flag would not work there. With multiple backends, some backends may have the feature implemented and some may not, so we cannot say the feature is absent in that env.
- Let's add a config option per backend feature, like encryption_enable_backend
- If there are too many such config options, then we need some other way to test both cases.
- Add new test class to cover the backend hint in volume_type.
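Such a per-backend option could be consumed along these lines. This is a hedged sketch: the option name `encryption_enable_backend` comes from the bullet above, but the comma-separated format, parsing, and skip logic are my own illustration, not existing Tempest code.

```python
def backend_supports(feature_backends: str, backend_name: str) -> bool:
    """Return True if backend_name appears in a comma-separated config
    value such as: encryption_enable_backend = lvm,ceph"""
    supported = {b.strip() for b in feature_backends.split(",") if b.strip()}
    return backend_name in supported


def should_skip_encryption_test(feature_backends: str, backend_name: str) -> bool:
    # A test would skip only when the specific backend under test
    # (selected via a volume_type backend hint) lacks the feature,
    # rather than assuming the whole env lacks it.
    return not backend_supports(feature_backends, backend_name)


print(should_skip_encryption_test("lvm,ceph", "nfs"))
```

This keeps the skip decision tied to the backend actually exercised by the test, which is the point of the multi-backend discussion above.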
How to handle the tox.ini constraint for each Tempest new tag
Tempest is branchless and releases tags per cycle, but tox.ini hard-codes the default constraints to the master constraints file. So an older Tempest tag's tox.ini (still pointing to the master constraints) might not be compatible. We need to pin the constraints in tox.ini when we release a new tag.
- Document the process for when to merge the tox updates and everything else to do, for example the devstack changes, etc.
Migrate hacking checks from diff. projects to hacking itself
We want to make sure we add/cover the most common and important pep8/flake8 checks in hacking itself: https://etherpad.opendev.org/p/hacking
- Comment on the etherpad and discuss them in the office hour
Description for testcases as docstrings
This is ongoing work to add the test docstring: https://blueprints.launchpad.net/openstack/?searchtext=testcase-description
As a further step, we discussed publishing those docstrings in the Tempest documentation using the auto-generation option.
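For reference, the kind of extraction an auto-generation step would perform can be sketched as follows; the test class and the collector function here are invented stand-ins, since Tempest would do this through its Sphinx doc build rather than a hand-rolled script.

```python
import inspect


class TestServersExample:
    """Stand-in for a Tempest test class."""

    def test_create_server(self):
        """Create a server and verify it reaches ACTIVE status."""

    def test_delete_server(self):
        """Delete a server and verify it disappears from the server list."""


def collect_test_docstrings(cls) -> dict:
    """Gather (test name -> docstring) pairs, as a doc generator might."""
    return {name: inspect.getdoc(fn)
            for name, fn in inspect.getmembers(cls, inspect.isfunction)
            if name.startswith("test_")}


for name, doc in collect_test_docstrings(TestServersExample).items():
    print(f"{name}: {doc}")
```

Publishing the collected docstrings would give reviewers and users a readable catalog of what each test actually verifies.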
Victoria Priority & Planning
We discussed the Victoria priority and here is the list: https://etherpad.opendev.org/p/qa-victoria-priority
We discussed what went well and what to improve next. A few things went well in the Ussuri cycle:
- After RC1 things went fairly smooth.
- We recruited two new cores (+ new stable cores \o/)
- Policy work done finally, whoop! +1 (this was a lot of code)
- nova-network is dead (along with nova-console, nova-consoleauth, nova-dhcpbridge, …)
- We went Python 3-only at last.
On the improvement side, more reviews are needed for faster merges in nova, and the number of cores is still not in line with the projects' ongoing work.
[nova][oslo]: Policy migration to handle scopes and roles
There was one bug reported where, when users generate a JSON-format policy file with the oslo tool, the policy file has all the new default rules without the deprecated values: https://bugs.launchpad.net/nova/+bug/1875418
We discussed the possible issues operators can face while adopting the new policy. Migrating to the new policy should be smooth, and there should be a consistent way to use the policy file. A policy file in JSON format is an issue because you cannot comment out the default rules. We should provide a better way to use the policy file.
The policy file should only have the override rules, not the ones with default values. Now that we have policy in code, any rule not present in the policy file will be taken from the defaults in code. If there are no rules to override, then it is fine not to pass a policy file at all. Below are the next steps and work items we need to finish in the Victoria cycle:
- Warn if default rules are present in the policy file
- An upgrade check will be a good place for this too
- Deprecate JSON generation in the tool and then remove it in the next cycle
- Warn when the policy file passed to oslo is JSON
- Change config policy_name default value from policy.json -> policy.yaml
- Convert the JSON file to YAML with the default rules commented out, keeping the overridden rules uncommented.
In summary, the first step will be to deprecate the JSON format and define the migration steps to the YAML format. Also, we need to add upgrade checks for the policy file format change and document it clearly in the project-side documentation.
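The conversion idea in the last bullet can be sketched like this. This is a toy illustration of the intended behavior only, not the actual oslo.policy tooling; the rule names are examples, and the output here is a simplified YAML-ish rendering rather than a full YAML emitter.

```python
import json


def policy_json_to_yaml(json_text: str, defaults: dict) -> str:
    """Convert a JSON policy file to YAML lines, commenting out rules that
    merely restate the in-code defaults and keeping real overrides active."""
    rules = json.loads(json_text)
    lines = []
    for name, rule in rules.items():
        if defaults.get(name) == rule:       # matches the default: comment out
            lines.append(f'#"{name}": "{rule}"')
        else:                                # a real override: keep active
            lines.append(f'"{name}": "{rule}"')
    return "\n".join(lines) + "\n"


defaults = {"os_compute_api:servers:show": "rule:admin_or_owner"}
json_policy = json.dumps({
    "os_compute_api:servers:show": "rule:admin_or_owner",  # same as default
    "os_compute_api:servers:create": "rule:admin_api",     # an override
})
print(policy_json_to_yaml(json_policy, defaults))
```

The commented-out defaults keep the file self-documenting while ensuring only genuine overrides take effect, which is exactly the "override rules only" principle described above.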
[nova][oslo]: Add a /healthcheck URL
This is to add a healthcheck endpoint for nova to check whether the service is running and ready to use; for example, a load balancer can call this endpoint to know the service status. There are multiple things to cover to say whether a service is healthy, like whether the DB, MQ, services, and cells are able to communicate. We cannot cover or report the complete health of a node; that is more in scope on the client side rather than in nova.
Details for flow and example response are in https://etherpad.opendev.org/p/nova-healthchecks
We discussed and agreed on below things to do as part of healthchecks:
- Do the auth data collection via a cache, as Dan mentioned in the review (731396); healthchecks will return that cached info to unauthenticated users without talking to the DB/MQ (basically no processing of data)
- API worker checks are OK to do; system-level checks can be on the client side or external to nova, as those include scanning all the nodes, etc.
- It will be implemented as a nova healthchecks plugin, and what exactly will be added to it will be discussed in the next sessions in the nova PTG.
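To make the agreed cache approach concrete, here is a minimal sketch of a cached healthcheck handler. All names, the TTL, and the response shape are assumptions for illustration; the real design is still to be discussed at the nova PTG.

```python
import json
import time


class CachedHealthcheck:
    """Toy /healthcheck handler: serves a cached status so unauthenticated
    callers never trigger DB/MQ work on the request path."""

    def __init__(self, ttl: float = 10.0):
        self.ttl = ttl          # seconds a cached status stays fresh
        self._cached = None
        self._stamp = 0.0

    def _collect_status(self):
        # In nova this would be refreshed out-of-band by periodic tasks
        # checking DB/MQ/cell connectivity; here it is hard-coded.
        return {"status": "pass", "checks": {"db": "ok", "mq": "ok"}}

    def __call__(self):
        now = time.monotonic()
        if self._cached is None or now - self._stamp > self.ttl:
            self._cached = self._collect_status()
            self._stamp = now
        return 200, json.dumps(self._cached)


handler = CachedHealthcheck()
status, body = handler()
print(status, body)
```

Because the request path only reads the cache, a load balancer can poll the endpoint frequently without adding load on the database or message queue.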