When do companies begin to acknowledge “DevOps” as a culture?
You may go on LinkedIn and see open positions for DevOps Engineers. While the shoe may fit, the responsibility of upholding a DevOps-based culture falls on the entire engineering department: quality engineers, software developers, and security teams included. Automated testing for CI/CD will require heavy lifting from the quality engineers; package scanning, license scanning, and SAST / DAST scanning may require the advice of security teams. Reminder: the goal of DevOps is to EMPOWER software developers. This is done through collaboration.
Your company’s adoption may rank differently for each DevOps concept. For example, deployments may be happening more frequently than in a waterfall model, but you are not yet in the cloud.
Companies in stage 1 are usually either affiliated with the government or are older companies. New companies typically start in later stages because they have no existing tech debt to uproot. Stage 1 companies typically look like this:
- Waterfall Deployments: large deployments with many tickets happen every few months or so. Bugs take longer to resolve, and each large deployment has a higher chance of breaking something
- On-Premise Architecture: infrastructure is hosted on premise. By contrast, cloud-hosted architecture allows companies to scale at their own pace without waiting on providers, product availability, or installation time; machines can be started almost instantly. Government or government-affiliated companies may be required by compliance to host servers on premise or in a data center they own
- Monolithic repositories — Code bases are MASSIVE. This increases the size of the service itself, increases the time necessary to build, and also consumes resources unnecessarily
- Security — It is but a distant dream…
- Monitoring and alerting: No alerts are set; issues are normally reported by a customer, which typically triggers a hot fix cycle
The big push was going from on-premise or data-center-hosted infrastructure to cloud-hosted. This involved taking inventory of all our infrastructure and documenting the ever-living shit out of it: firewalls, programs, users, groups, files, and the works. From there, we began building, starting with the developer environment. Developers were slowly onboarded onto the cloud platform. We then created QA, staging, and UAT environments. Production servers that were not client-facing were moved over. We captured images of the servers and performed the installations on EC2 instances.
The major move was migrating to a cloud platform to increase scalability. We brought in quality engineers to test and confirm that each service was operational.
In stage 2, companies are beginning to understand and lobby for DevOps principles.
- Cloud Computing: It may not be impossible to implement DevOps principles without the cloud, but it will be damn difficult. Migrating into the cloud will speed up how quickly your company is able to get more resources while also empowering the DevOps team with a multitude of services within a cloud platform. In stage 2, your department may be creating lower environments in the cloud for developers to test in. Then, a production environment may be created alongside what already exists. Low-stakes, low-impact production services (e.g., internal-facing services) are shifted to the cloud first.
- Deployments are manual, but happen more frequently — There is an extra role and responsibility dedicated to release management. Deployments are maintained by this individual.
- Alerting is synthetic: Basic uptime and downtime are tracked with tools like Pingdom. Status pages can communicate to customers when downtime is happening.
- New services are being created as microservices — As engineering departments acknowledge the benefits of creating microservices, there is a shift from adding to the monolithic repository to creating new microservices. At this stage, there is planning to break up the monolithic service, but it hasn’t quite happened.
- Taking inventory of security: Security has an inventory of infrastructure and has begun assessing what needs to be secured. A backlog is being created based on needs.
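To make the synthetic alerting item in the list above more concrete, here is a minimal Python sketch of the kind of uptime probe a tool like Pingdom runs for you. It is an illustration only, not how any particular product works: the names `CheckResult`, `http_status`, and `run_check` are hypothetical, and treating any 2xx or 3xx status as "up" is an assumption you would tune per service.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    url: str
    up: bool
    status: Optional[int]  # None when no HTTP response was received
    latency_ms: float


def http_status(url: str, timeout: float = 5.0) -> int:
    """Send a GET request and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError; the code is still a valid answer.
        return err.code


def run_check(url: str, fetch: Callable[[str], int] = http_status) -> CheckResult:
    """Probe one endpoint and classify it as up or down (hypothetical helper)."""
    started = time.monotonic()
    try:
        status: Optional[int] = fetch(url)
    except Exception:
        # DNS failure, refused connection, timeout: treat all of these as down.
        status = None
    latency_ms = (time.monotonic() - started) * 1000.0
    # Assumption: any 2xx or 3xx status counts as "up".
    up = status is not None and 200 <= status < 400
    return CheckResult(url=url, up=up, status=status, latency_ms=latency_ms)
```

In practice you would call something like `run_check("https://example.com/health")` on a schedule from outside your own network and feed the result into a status page. The `fetch` parameter exists so the classification logic can be exercised without real network traffic.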
The transition from stage 2 to stage 3 involved a department-wide initiative to adopt DevOps as a culture, rather than as a siloed team. On the development side, our team began to acknowledge the limits of the monolithic architecture and to create microservices for better horizontal scalability and a more rapid pace of development.
Some minor pushes:
- Creating synthetic alerts
- Documenting and taking inventory of available tests
- Cloud services — There may be one or two services still on premise. For the most part all services are in the cloud.
- Deployments: By stage 3, deployments may be happening once a week. The need for a dedicated release manager gives way to self-service releases maintained by development teams. Empowering developers to maintain their own releases enables a shift towards CI/CD.
- Transition to microservices: Engineering teams are breaking down the monolithic repository. Engineers are working with a dedicated software architect to carve microservices out of the large one.
- Inventory of manual tasks: DevOps teams are creating a backlog of tickets for tasks being done manually and assigning priority based on effort to impact. From a managerial perspective, companies are beginning to go through a backlog refinement process to have conversations around priority.
- Manual testing: QE teams are conducting manual tests and documenting them. This is laborious and time-consuming. QE Engineers may be interested in moving into development roles, which paves the way for potential SDETs.
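The "priority based on effort to impact" idea from the manual-task inventory above can be sketched in a few lines of Python. This is only one possible scoring scheme: the 1 to 5 scales, the `ManualTask` class, the `prioritize` function, and the sample tasks are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ManualTask:
    name: str
    impact: int  # assumed scale: 1 (low) to 5 (high)
    effort: int  # assumed scale: 1 (small) to 5 (large)

    @property
    def score(self) -> float:
        """Impact gained per unit of effort; higher means automate sooner."""
        return self.impact / self.effort


def prioritize(tasks: Iterable[ManualTask]) -> List[ManualTask]:
    """Order tasks by score, breaking ties in favor of lower effort, then name."""
    return sorted(tasks, key=lambda t: (-t.score, t.effort, t.name))


# Hypothetical backlog entries.
backlog = [
    ManualTask("manual release checklist", impact=5, effort=5),
    ManualTask("rotate TLS certificates by hand", impact=4, effort=2),
    ManualTask("restart stuck worker", impact=3, effort=1),
]

for task in prioritize(backlog):
    print(f"{task.score:.1f}  {task.name}")
```

A simple ratio like this is mostly a conversation starter for backlog refinement; teams often adjust the ordering by hand once risk and dependencies are discussed.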
Our team started implementing CI/CD: creating pipelines and educating developers on maintaining them.
Our DevOps-only initiative was observability: creating alerting and monitoring around services.
- Start to develop an application onboarding process
- Utilizing security resources available. For example, package scanning in repositories for vulnerabilities
- Containerization — DevOps teams are shifting to containerization and integrating Kubernetes to perform container orchestration. Using containers allows for horizontal scaling to be automated, especially taking advantage of deployments and horizontal pod autoscalers within Kubernetes. Containers are intended to be lightweight and perform one task only, making them quick and easy to build.
- Microservice architecture: Services are lightweight. Companies may have begun to integrate an artifact repository such as JFrog Artifactory in order to reference current code bases.
- Site Reliability Principles: Following Google’s Site Reliability Engineering book, SLAs, SLOs, and SLIs are being discussed between site reliability engineers and development teams. SREs are developing runbooks or playbooks, have a well-established incident response plan, and have run through playbooks alongside development teams.
- Consistent releases occurring — Release cycles have become self-service, allowing development teams to manage their own releases and potential hot fixes without the assistance of DevOps engineers or a release manager. A QE engineer on each team is in communication to ensure releases pass basic checks.
- Observability — DevOps teams have created dashboards for monitoring microservices and doing log management. They have created a standard tagging convention for metrics to be easily pulled from a platform. Creating the initial dashboards for container monitoring and logs and establishing a tagging convention paves the way for application onboarding processes.
- Shift left with security: Process auditing, container image auditing and scanning, and SAST / DAST scanning are occurring within deployment pipelines. There’s still a slew of package vulnerabilities to be fixed.
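For a feel of what the horizontal pod autoscaler mentioned in the containerization item is doing, the core scaling rule documented by Kubernetes is `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. The Python sketch below applies that rule with a tolerance band and min/max clamping. It is a simplification: the real controller also deals with missing metrics, pods that are not ready, and stabilization windows, and the function name and default values here are assumptions for illustration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified version of the HPA scaling rule (illustrative only)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    desired = math.ceil(current_replicas * ratio)
    # Never go outside the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Four pods averaging 90% CPU against a 50% target.
print(desired_replicas(current_replicas=4, current_metric=90, target_metric=50))
```

The same rule scales down when the metric drops well below the target, which is why a sensible `min_replicas` matters for services that must stay available during quiet periods.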
I’m not sure yet…
I would imagine this would include incorporating a service catalog into the mix. Application onboarding would be standardized to have a self-service deployment pipeline, alerting, monitoring, logging, SLIs, SLOs, and SLAs.
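As a rough illustration of what standardized onboarding against a service catalog could look like, here is a small Python sketch that checks whether a service has every capability listed above. It is speculative in the same way the paragraph is: `CatalogEntry`, the capability names, and the example service are hypothetical, not taken from any real catalog product.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Capabilities every onboarded service is expected to have (assumed names).
REQUIRED_CAPABILITIES = (
    "deployment_pipeline",
    "alerting",
    "monitoring",
    "logging",
    "slis",
    "slos",
    "slas",
)


@dataclass
class CatalogEntry:
    name: str
    owner: str
    capabilities: Set[str] = field(default_factory=set)


def missing_capabilities(entry: CatalogEntry) -> List[str]:
    """Return the required capabilities this service has not set up yet."""
    return [c for c in REQUIRED_CAPABILITIES if c not in entry.capabilities]


def is_onboarded(entry: CatalogEntry) -> bool:
    """A service counts as onboarded once nothing required is missing."""
    return not missing_capabilities(entry)


# Hypothetical service that is partway through onboarding.
billing = CatalogEntry(
    name="billing-api",
    owner="payments-team",
    capabilities={"deployment_pipeline", "logging", "monitoring"},
)
print(missing_capabilities(billing))
```

A check like this could run in the deployment pipeline itself, so that a team sees what is left to set up before a service is considered production-ready.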
Vacationing on a beach with a margarita in hand.
Some alerts to let you know you’re alive and employed.
I have not worked for a company employing stage 5 DevOps practices, but I am actively working on an advanced team. I imagine Stage 5 DevOps teams can adequately discern their needs and pick through newly emerging products to find what works. These teams have fully implemented continuous integration and continuous deployment.
Unit tests, load tests, package scanning, license scanning, and SAST and DAST have been introduced into deployments. This will involve the work of quality engineers, security teams, and software developers. Releases to production can happen quickly.
Site reliability engineers and DevOps engineers have joined forces to implement regular chaos engineering (buzzword for disaster recovery) exercises. I would also imagine teams would have developed “purple teams”, which is a collaborative effort between development, DevOps engineers, and security teams to shift security practices earlier.
It will likely not be possible to automate EVERYTHING due to breaking changes being released, but one can dream.