Fix competency extraction feature #145

MichaelOwenDyer · 2024-08-09T00:52:28Z

Summary by CodeRabbit

New Features
- Introduced a competency extraction pipeline to enhance the processing of course descriptions.
- Added new API endpoints for running competency extraction asynchronously.
- Implemented a structured AI prompt for extracting competencies from course descriptions.
- Introduced a new class for managing competency extraction status updates.
- Enhanced data validation capabilities with new competency data models.
- Simplified state management in the status update system for competency extraction.
Bug Fixes
- Streamlined data structures in various data transfer objects for improved clarity and performance.
Documentation
- Enhanced docstring readability in the course chat pipeline without altering functionality.

Some manual notes:

I decided not to use LangChain to implement this feature; I just call the CapabilityRequestHandler directly.
The original implementation of this feature called the LLM in a loop for each additional competency, which was horrible for performance. Now, I just call the LLM once and parse many competencies out of the returned string. In my testing, the pipeline takes only a second or two to complete now.
This PR contains only two changes to existing code: I moved the duplicated settings and initialStages attributes out of the subclasses of PipelineExecutionDTO and into the superclass. This allows me to use PipelineExecutionDTO in my new CompetencyExtractionPipelineExecutionDTO (I opted for composition, not inheritance).
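
A minimal sketch of the "call once, parse many" idea from the notes above. It is not the code from this PR: it assumes the prompt asks the model for a JSON array of objects, and the `Competency` fields, the `parse_competencies` helper, and the way `max_n` is applied are all hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class Competency:
    # Hypothetical fields; the PR's actual Competency model may differ.
    title: str
    description: str


def parse_competencies(response: str, max_n: int) -> list[Competency]:
    """Parse up to max_n competencies out of a single LLM response string.

    Assumption: the model was asked to answer with a JSON array of objects.
    """
    try:
        items = json.loads(response)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    competencies = []
    for item in items:
        # Skip malformed entries instead of failing the whole batch.
        if not isinstance(item, dict):
            continue
        title = item.get("title")
        description = item.get("description")
        if isinstance(title, str) and isinstance(description, str):
            competencies.append(Competency(title, description))

    # Apply a hard cap even if the model returns more than requested.
    return competencies[:max_n]
```

Parsing defensively like this means one bad entry in the response does not discard the rest, which matters more once a single call carries every competency.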

coderabbitai · 2024-08-09T00:52:38Z

Walkthrough

The changes significantly enhance the competency extraction capabilities of the application by introducing and refining data transfer objects (DTOs), implementing new pipeline classes, and adding API endpoints. The updates improve the architecture, ensure data integrity, and streamline the competency extraction process, facilitating more effective educational assessments and user interactions. Additionally, the status update mechanisms have been simplified, enhancing overall responsiveness.

Changes

| Files and Folders | Change Summary |
|---|---|
| `app/domain/__init__.py`, `app/domain/chat/chat_pipeline_execution_dto.py`, `app/domain/ingestion/ingestion_pipeline_execution_dto.py`, `app/domain/pipeline_execution_dto.py` | Various attributes removed or added in DTOs to streamline competency extraction processes. |
| `app/domain/competency_extraction_pipeline_execution_dto.py`, `app/domain/status/competency_extraction_status_update_dto.py`, `app/domain/data/competency_dto.py` | Introduced new DTOs (`CompetencyExtractionPipelineExecutionDTO` and `CompetencyExtractionStatusUpdateDTO`) and `Competency` class for managing extraction data and validation. |
| `app/pipeline/__init__.py`, `app/pipeline/competency_extraction_pipeline.py`, `app/pipeline/prompts/competency_extraction.py` | Implemented new pipeline class and system prompts for competency extraction logic and AI model interaction. |
| `app/web/routers/pipelines.py`, `app/web/status/status_update.py` | Added new API endpoints and classes (like `CompetencyExtractionCallback`) to enhance status management and execution of competency extraction. |
| `app/llm/external/openai_chat.py` | Removed an import for `ResponseFormat`, indicating a shift in how response formats are handled in chat processing. |
| `app/pipeline/chat/course_chat_pipeline.py` | Modified docstring of `lecture_content_retrieval` for improved readability. |


coderabbitai

Actionable comments posted: 7

Outside diff range, codebase verification and nitpick comments (1)

app/domain/chat/chat_pipeline_execution_dto.py (1)
5-5: Impact of Removing settings and initial_stages:

The removal of settings and initial_stages from the DTO affects several parts of the application. These attributes are referenced in multiple files, indicating their active role in the application's logic. Here are some key locations where they are used:

app/pipeline/lecture_ingestion_pipeline.py

app/pipeline/chat/lecture_chat_pipeline.py

app/web/status/status_update.py

app/web/routers/pipelines.py

app/web/routers/webhooks.py

Please ensure these dependencies are addressed to avoid breaking changes.
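
As a rough illustration of why existing references can keep working when shared attributes move into the superclass, here is a sketch using plain dataclasses. The dataclasses stand in for whatever model base class the project actually uses, and every field other than `settings` and `initial_stages` (for example `chat_history`, `execution`, `course_description`, `max_n`) is a hypothetical placeholder.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PipelineExecutionDTO:
    # Shared attributes declared once on the superclass (types are placeholders).
    settings: Optional[dict] = None
    initial_stages: list = field(default_factory=list)


@dataclass
class ChatPipelineExecutionDTO(PipelineExecutionDTO):
    # The subclass no longer redeclares settings / initial_stages;
    # it inherits them, so callers reading dto.settings are unaffected.
    chat_history: list = field(default_factory=list)


@dataclass
class CompetencyExtractionPipelineExecutionDTO:
    # Composition: holds a PipelineExecutionDTO instead of inheriting from it.
    execution: PipelineExecutionDTO
    course_description: str = ""
    max_n: int = 10


chat = ChatPipelineExecutionDTO(settings={"lang": "en"}, initial_stages=["queued"])
dto = CompetencyExtractionPipelineExecutionDTO(execution=PipelineExecutionDTO())
```

In this sketch `chat.settings` and `chat.initial_stages` resolve through inheritance, while the composed DTO reaches the same attributes via `dto.execution`.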

Analysis chain

Verify the impact of removing settings and initial_stages.

The removal of these attributes simplifies the DTO, but ensure that their absence does not affect other parts of the application.
Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the usage of `settings` and `initial_stages` in the codebase.

# Test: Search for any references to the removed attributes. Expect: No references.
rg --type python 'settings|initial_stages'
Length of output: 75

Script:
#!/bin/bash
# Corrected script to search for usage of `settings` and `initial_stages` in Python files.

# Search for any references to the removed attributes in Python files.
rg 'settings|initial_stages' --glob '*.py' -A 3
Length of output: 9614

app/domain/__init__.py

app/pipeline/__init__.py

app/domain/data/competency_dto.py

app/pipeline/competency_extraction_pipeline.py

app/web/routers/pipelines.py

app/web/status/status_update.py

bassner

Code LGTM

MichaelOwenDyer added 3 commits August 8, 2024 19:28

Competency extraction

9ea513b

Merge branch 'refs/heads/main' into feature/competency-extraction

bc2935d

Fix typos

97d4c24

github-actions bot added component:Pipeline component:FastAPI component:Domain labels Aug 9, 2024

Remove debug print statement

15c5a51

coderabbitai bot requested changes Aug 9, 2024

Apply coderabbit suggestions

dcf3510

MichaelOwenDyer requested review from coderabbitai[bot], Hialus, bassner, kaancayli and yassinsws and removed request for coderabbitai[bot] August 9, 2024 12:53

Format

41df050

coderabbitai bot previously approved these changes Aug 9, 2024

Add hard limit on generated competencies to max_n

117a6e2

MichaelOwenDyer dismissed coderabbitai[bot]’s stale review via 117a6e2 August 9, 2024 13:24

coderabbitai bot previously approved these changes Aug 9, 2024

MichaelOwenDyer added the deploy:pyris-test label Aug 9, 2024

github-actions bot added lock:pyris-test and removed deploy:pyris-test labels Aug 9, 2024

MichaelOwenDyer temporarily deployed to Iris Test August 9, 2024 18:28 — with GitHub Actions Inactive

Do not throw ValueError when calling callback.done()

a202b2c

MichaelOwenDyer dismissed coderabbitai[bot]’s stale review via a202b2c August 9, 2024 19:20

coderabbitai bot previously approved these changes Aug 9, 2024

Tweak prompt

6b81d36

MichaelOwenDyer dismissed coderabbitai[bot]’s stale review via 6b81d36 August 9, 2024 20:45

github-actions bot added the lock:pyris-test label Aug 21, 2024

MichaelOwenDyer temporarily deployed to Iris Test August 21, 2024 14:38 — with GitHub Actions Inactive

Fix dumb import path

3507f45

MichaelOwenDyer dismissed coderabbitai[bot]’s stale review via 3507f45 August 21, 2024 14:55

coderabbitai bot previously approved these changes Aug 21, 2024

MichaelOwenDyer added deploy:pyris-test and removed lock:pyris-test labels Aug 21, 2024

github-actions bot added lock:pyris-test and removed deploy:pyris-test labels Aug 21, 2024

MichaelOwenDyer temporarily deployed to Iris Test August 21, 2024 15:28 — with GitHub Actions Inactive

Remove unused import causing Black to fail

38dcfe3

MichaelOwenDyer dismissed coderabbitai[bot]’s stale review via 38dcfe3 August 21, 2024 15:37

github-actions bot added the component:LLM label Aug 21, 2024

coderabbitai bot approved these changes Aug 21, 2024

MichaelOwenDyer added deploy:pyris-test and removed lock:pyris-test labels Aug 21, 2024

github-actions bot added lock:pyris-test and removed deploy:pyris-test labels Aug 21, 2024

MichaelOwenDyer temporarily deployed to Iris Test August 21, 2024 16:40 — with GitHub Actions Inactive

bassner approved these changes Aug 27, 2024

bassner added deploy:pyris-test and removed lock:pyris-test labels Aug 27, 2024

Merge branch 'main' into feature/competency-extraction

9782fb9

github-actions bot added lock:pyris-test and removed deploy:pyris-test labels Aug 27, 2024

bassner removed the lock:pyris-test label Aug 27, 2024

bassner merged commit 5765c9c into main Aug 27, 2024
5 checks passed

bassner deleted the feature/competency-extraction branch August 27, 2024 11:38

This was referenced Oct 7, 2024

Exercise Chat: Implement native function calling agent #154

Open

Track token usage of iris requests #165

Open
