Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Support for Other Filing Types #88

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

akaashdash
Copy link

Working on #87

So far:

  • Abstracted top section data class
  • Abstracted top section manager
  • Updated 10Q section list
  • Created 10K section list
  • Updated 10Q top section manager
  • Created 10K top section manager
  • Updated existing and dependencies, etc

To do:

  • Much more test coverage
  • Add tests for 10-K
  • Verify everything works
  • Double check that empty part names are not an issue

@akaashdash akaashdash requested a review from Elijas as a code owner December 17, 2024 07:23
@akaashdash
Copy link
Author

Completed:

  • Verified everything works
  • Double checked that empty part names are not an issue
  • Verified existing 10-Q implementation is effectively unchanged
  • Fixed formatting issues
  • updated 10-k parts to match sec standard: https://www.sec.gov/files/form10-k.pdf
  • Exposed Edgar10KParser to package
  • Updated part and item matching to support extended range found in files besides 10-Q

To-Do:

  • Add tests for 10-K

@akaashdash
Copy link
Author

To-Do:

  • Add tests for 10-K

This kinda falls under #67 and #48 as both still apply, just need to expand the same to 10-Ks

The only thing I can think of from here is that Edgar10QParser and Edgar10KParser are currently duplicated code. assuming all files will follow the same steps, I could abstract it out in a similar to how I did TopSectionManager (but do we even want to do this?

@INF800
Copy link
Contributor

INF800 commented Dec 18, 2024

Hi @akaashdash, thanks for the amazing work! 🚀⚡

I have a couple of questions:

  1. In the latest commit, are the outputs of Edgar10KParser and Edgar10QParser essentially same?
  2. Are you testing your code with some set of 10-K filings? And is the unit tests accuracy of 10-Q filings same as it was before the new changes?

@akaashdash
Copy link
Author

Hi @akaashdash, thanks for the amazing work! 🚀⚡

I have a couple of questions:

  1. In the latest commit, are the outputs of Edgar10KParser and Edgar10QParser essentially same?
  2. Are you testing your code with some set of 10-K filings? And is the unit tests accuracy of 10-Q filings same as it was before the new changes?

Edgar10KParser and Edgar10QParser are the same except one uses TopSectionManagerFor10K and the other uses TopSectionManagerFor10Q. At its core, both TopSectionManagers use the same code, just the list of headings is changed. Therefore, both will have the same output when ran on their respective filings. However, using a 10Q on the 10K parser or a 10K on a 10Q parser will still error out, but this is expected behavior as this is also how it was before my changes.

@akaashdash
Copy link
Author

Hi @akaashdash, thanks for the amazing work! 🚀⚡

I have a couple of questions:

  1. In the latest commit, are the outputs of Edgar10KParser and Edgar10QParser essentially same?
  2. Are you testing your code with some set of 10-K filings? And is the unit tests accuracy of 10-Q filings same as it was before the new changes?

I forgot to answer the second part sorry. I was testing my code manually, but unit tests for 10Ks need to be added. Not sure how to do it without breaking existing tests, some help on this would be appreciated.

All tests for 10Q have the exact same results as before, no changes on that front.

@INF800
Copy link
Contributor

INF800 commented Dec 21, 2024

Hi @akaashdash thanks for the answering the questions. I would strongly recommend writing tests for 10-K parser like we did for 10-Q parser because we need to cover extremely large number of filings / edge cases.

.. Not sure how to do it without breaking existing tests, some help on this would be appreciated.

Please go through Discussion 48 to understand how we have setup tests for 10-Q (or parsing test in general). Let me know if something doesn't make sense!

@akaashdash
Copy link
Author

Completed:

  • Added exploratory tests for 10-K
  • Confirmed that all unit/regression/etc tests still perform the exact same

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants