背景与动机
在当今快速发展的技术环境中,软件开发已成为创新和商业成功的关键驱动力。随着像 Codex 这样的 AI 技术以及 AI 辅助的配对编程工具的出现,有机会彻底改变软件开发的方式。本文概述了一种战略方法,通过采用大型语言模型的能力,如 Codex、总结及聊天机器人工具,来提升软件开发生命周期(SDLC),从而提高生产力、效率、代码质量和上市时间,同时遵守自身的 SDLC 标准和编码模式。
利用这些工具的一个关键目标是使开发人员、测试人员、SRE、产品经理及敏捷教练等关键资源能够专注于高层次的思考任务,如设计、审查、架构、业务用例等方面,而将低层次的任务,如文档编写、测试数据创建、编写测试用例、生成状态报告等,交给 AI 工具处理。
Approach 方法
以下是 SDLC 各个阶段及其一些关键任务的示意图,并将 AI 驱动的 SDLC 与非 AI 驱动的 SDLC 进行了比较。AI 驱动的 SDLC 包括利用 Codex 工具和 LLM 生成/建议代码、创建单元测试用例、创建 SQL 语句、代码分析、代码翻译、生成发布说明、状态报告、代码文档等。

Figure 1: Comparison of AI Powered SDLC vs Non AI SDLC 图 1:AI 驱动的 SDLC 与非 AI 驱动的 SDLC 比较
Below are few key areas how AI Codex and LLM tools can be helpful in SDLC.
以下是 AI Codex 和 LLM 工具在 SDLC 中可以提供帮助的一些关键领域。
- Summarization Tools: AI can be used to summarize code changes, document code, create release notes, readme documents, create test summary reports, draft product guides, making it easier for product owners & developers to develop software more rapidly.
总结工具:AI 可以用于总结代码更改、文档化代码、创建发布说明、README 文档、创建测试总结报告、起草产品指南,使产品所有者和开发人员能够更快地开发软件。 - Code Recommendations: AI can suggest relevant code snippets, design patterns, and best practices while developers write code. Advanced custom trained codex models can generate good quality code while adhering to your organization’s standards and software framework.
代码推荐:AI 可以在开发者编写代码时建议相关的代码片段、设计模式和最佳实践。高级自定义训练的代码模型可以在遵守您组织的标准和软件框架的同时生成高质量的代码。 - Chatbots — Interactive Q&A/Knowledge Search: AI-driven chatbots can assist various teams in finding solutions, answering queries, and sharing knowledge.
聊天机器人——交互式问答/知识搜索:基于 AI 的聊天机器人可以协助各个团队查找解决方案、回答问题和分享知识。 - Standards Checkers: AI can perform code analysis to ensure adherence to coding standards and identify potential issues.
标准检查器:AI 可以进行代码分析以确保遵守编码标准并识别潜在问题。 - Testing Tools: Utilize AI for generating test data, suggesting SQL queries, and even automating the creation of test cases.
测试工具:利用 AI 生成测试数据、建议 SQL 查询,甚至自动化测试用例的创建。
Below table illustrates few SDLC tasks and sample Codex prompts for those tasks.
以下表格展示了部分 SDLC 任务以及这些任务的示例 Codex 提示。

Table 1: Sample SDLC tasks and sample Codex prompts 表 1:示例 SDLC 任务和示例 Codex 提示
AI Codex Tooling AI 编码工具
To achieve above, we need a set of AI tools custom trained on quality datasets with specific focus areas such as Codex trained on quality code, chatbot trained on knowledge banks ( code + confluence pages + change requests, run books etc). And we need to integrate them with SDLC tool chain such as Jira, GitHub, IDE’s, CI/CD tools, Release management & SRE tools.
为了实现上述目标,我们需要一套针对高质量数据集进行定制训练的 AI 工具,重点关注领域如 Codex 训练在高质量代码上,聊天机器人训练在知识库(代码+Confluence 页面+变更请求、运行手册等)上。并且需要将它们与 SDLC 工具链(如 Jira、GitHub、IDE、CI/CD 工具、发布管理及 SRE 工具)集成。
Strategy & Execution Plan策略与执行计划
Transitioning to AI Powered SDLC needs a detailed strategy and thoughtful execution plan. Below figure depicts how to approach this.
向 AI 驱动的 SDLC 过渡需要详细的策略和周到的执行计划。以下图表描述了如何进行这一过程。

The key aspects are:关键方面是:
1) Defining the AI SDLC Vision and Evaluation approach
- 定义 AI 驱动的软件开发生命周期愿景和评估方法
Vision 愿景
· Improve productivity.· 提高生产力。
· Leverage talent for higher order tasks
· 充分利用人才进行高层次任务
· Automate lower order tasks.
· 自动化低层次任务。
· Bring standardization.· 带来标准化。
Evaluation Strategy 评估策略
Identify two cohorts — one using AI tools and other using non-AI tools. Measure improvements between two cohorts for various SDLC tasks and overall improvement(s) by comparing key metric such as:
识别两个群体——一个使用 AI 工具,另一个使用非 AI 工具。通过比较关键指标,如:衡量两个群体在各种 SDLC 任务中的改进以及总体改进情况。
· Num of story points delivered per sprint.
· 每个迭代交付的故事点数量。
· Idea to Implementation Time
· 从想法到实施时间
· Documentation Quality · 文档质量
· Adherence to standards · 遵循标准
· % Unit Tests Coverage
· 单元测试覆盖率
· Num of Post Release defects
· 发布后缺陷数量
· Number of releases per sprint
· 每个迭代的发布次数
· New resource onboarding
· 新资源上线
· Avg time to resolve an issue (SRE)
· 平均解决问题时间(SRE)
2) Defining Maturity Levels for phased transition
2) 定义成熟度等级以分阶段过渡
Define & adopt AI powered SDLC through maturity level-based approach over a period. You can define the levels and what that level means as per your organization’s needs and interests. In addition, identify a timeline for achieving each level which could be defined based on your organization budget and desire to transition to AI SDLC. Below is an illustration of such levels and indicative timelines.
通过基于成熟度等级的方法定义并采用 AI 驱动的 SDLC。您可以根据组织的具体需求和兴趣来定义这些等级及其含义。此外,确定每个等级的实现时间表,这可以基于组织的预算以及向 AI SDLC 过渡的意愿来定义。以下是一些等级及其示例时间表的示例。

Figure 3: Maturity Levels for AI Powered SDLC Transformation 图 3:AI 驱动 SDLC 转型的成熟度等级
One key success factor is leveraging custom trained Codex tools and active learning for the Codex tools while ensuring their alignment (helpfulness, harmlessness & honesty). This can be achieved by training the models on your datasets and leveraging Reinforcement Learning with Human Feedback (RLHF) for alignment to maximize Codex tools benefits. This is depicted in Level 3 in the above maturity pyramid.
一个关键的成功因素是利用自定义训练的 Codex 工具和主动学习,同时确保这些工具的对齐(有用性、无害性和诚实性)。这可以通过在您的数据集上训练模型,并利用人类反馈强化学习(RLHF)进行对齐来实现,以最大化 Codex 工具的好处。这在上述成熟度金字塔的第 3 级有所体现。
In addition, you need to continually retrain Codex tools on your latest datasets, so you need to have a framework such as LLMOps. You would need LLMOps specific tools such as vector databases, LLM benchmarking tools, RHLF frameworks such as OpenAI Gymn or kerasrl or TRL ( Transform Reinforcement Learning )
此外,您需要不断使用最新的数据集重新训练 Codex 工具,因此您需要有一个如 LLMOps 的框架。您需要特定于 LLMOps 的工具,如向量数据库、LLM 基准测试工具、OpenAI Gymn、kerasrl 或 TRL(Transform Reinforcement Learning)等 RLHF 框架。
3) Identifying & integrating the AI codex tools and provisioning them for use
3) 识别并集成 AI Codex 工具,并为使用做好准备
There are several open source and closed-source tools & models available. Below is a list of few popular tools.
有多种开源和闭源的工具与模型可供选择。以下是几种流行的工具。
Codex Tools Codex 工具
• OpenAI Codex ( Github)
• OpenAI Codex (Github)
• CodeT5 ( Salesforce )
• CodeT5 (Salesforce)
• Starcoder ( Service Now)
• Starcoder (Service Now)
• Vertx Codey ( Google tools)
• Vertx Codey (Google tools)
• LLaMA models ( facebook)
• LLaMA 模型 (Facebook)
LLM Models LLM 模型
· GPT3/4 from OpenAI · GPT3/4 来自 OpenAI
· LLaMA2 ( facebook) · LLaMA2(Facebook)
Chatbots for Knowledge Bank
知识库聊天机器人
· fastChat
· GPTChat ( OpenAI) · GPTChat (OpenAI)
Orchestration Tools 编排工具
· Langchain
· Langflow
IDE Plugins: There are plugins coming up for the popular tools such as Jira, IntelliJ, VS Code already. You can consider them or develop your own plugins to integrate Codex Tools in your Organization SDLC workflow.
IDE 插件:对于像 Jira、IntelliJ、VS Code 这样的流行工具,已经有了一些插件。你可以考虑使用这些插件,或者开发自己的插件来将 Codex Tools 集成到你们组织的 SDLC 流程中。
4) Continuously evaluate and transition to AI powered SDLC
4) 持续评估并过渡到 AI 驱动的 SDLC
Continuously evaluate the overall approach and measure success at each phase/level using some of the metrics suggested above.
持续评估整体方法,并在每个阶段/级别使用上述建议的一些指标来衡量成功。
Expected Results 预期结果
With a defined strategy and proper execution, we should be able to achieve significant efficiencies esp. around 25–50% improvement in overall Software Development Time and Effort.
有了明确的策略和适当的执行,我们应该能够在整体软件开发时间和努力方面实现显著的效率提升,特别是在 25–50%的改进方面。
Below is one example from a study at Duolingo using Github Co-Pilot:
以下是 Duolingo 的一项研究中使用 Github Co-Pilot 的一个例子:
· 25% increase in developer speed for those who are new to working with a specific repository.
· 对于首次使用特定代码仓库的新开发者,开发速度提高了 25%。
· 10% increase in developer speed for those who are familiar with the respective codebase.
· 对于熟悉相应代码库的开发者,开发速度提高了 10%。
· 67% decrease in median code review turnaround time.
· 中位数代码审查周转时间减少了 67%。

Source: https://github.blog/2023-05-09-how-companies-are-boosting-productivity-with-generative-ai/
源: https://github.blog/2023-05-09-how-companies-are-boosting-productivity-with-generative-ai/
Watch Items & Risks 关注项与风险
While integrating AI Codex and pair programming tools can bring substantial benefits, there are potential risks and challenges to consider:
将 AI Codex 和配对编程工具集成虽然能带来显著的好处,但也需要考虑潜在的风险和挑战:
- Data Security: Ensure that sensitive data and proprietary code remain protected when using AI-powered tools.
数据安全:在使用 AI 驱动的工具时,确保敏感数据和专有代码得到保护。 - Accuracy of Suggestions: There might be instances where AI-generated code suggestions are not accurate or aligned with the intended functionality.
建议的准确性:AI 生成的代码建议可能存在不准确或与预期功能不一致的情况。 - Lack of Human Creativity: Overreliance on AI might hinder the creativity and innovative thinking of developers.
缺乏人类创造力:过度依赖 AI 可能会阻碍开发人员的创造力和创新思维。
Conclusion 结论
Adopting Codex and AI pair programming tools offers a transformative opportunity for large enterprises to streamline their software development processes. By integrating AI into various stages of the SDLC, organizations can unlock higher productivity, improved code quality, and faster time-to-market. While challenges may arise, the potential benefits far outweigh the initial complexities. With the right strategy, Codex and AI-powered tools can revolutionize the way software development is approached, propelling organizations to new levels of innovation and efficiency.
采用 Codex 和 AI 配对编程工具为大型企业提供了变革的机会,以简化其软件开发流程。通过将 AI 集成到 SDLC 的各个阶段,组织可以提高生产率、提升代码质量并加快上市时间。虽然可能会遇到挑战,但潜在的好处远远超过了初始的复杂性。通过正确的策略,Codex 和 AI 驱动的工具可以彻底改变软件开发的方式,推动组织达到新的创新和效率水平。
Recommended from Medium 推荐来自 Medium
[
See more recommendations