diff --git a/.github/workflows/book.yml b/.github/workflows/book.yml new file mode 100644 index 0000000..621bc89 --- /dev/null +++ b/.github/workflows/book.yml @@ -0,0 +1,38 @@ +name: deploy-book + +# Only run this when the master or main branch changes +on: + push: + branches: + - master + - main + +# This job installs dependencies, builds the book, and pushes it to `gh-pages` +jobs: + deploy-book: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v2 + + # Install dependencies + - name: Set up Python 3.7 + uses: actions/setup-python@v1 + with: + python-version: 3.7 + + - name: Install dependencies + run: | + pip install -r requirements.txt + + + # Build the book + - name: Build the book + run: | + jupyter-book build docs/ + + # Push the book's HTML to github-pages + - name: GitHub Pages action + uses: peaceiris/actions-gh-pages@v3.5.9 + with: + github_token: ${{ secrets.GITHUB_TOKEN }} + publish_dir: ./docs/_build/html \ No newline at end of file diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..16ec5a3 --- /dev/null +++ b/.gitignore @@ -0,0 +1,24 @@ +.DS_Store +.idea +.ipynb_checkpoints + +P1_H_C3H_M004_17-cropped.zarr +P1_H_C3H_M004_17-processed.zarr + +temp* + + +docs/20a_pixel_classification/test.cl +docs/20a_pixel_classification/my_model.cl +docs/20a_pixel_classification/test2.cl +*.pyc +data/CalibZAPWfixed_000154_max_labels.tif +data/y293-Gal4_vmat-GFP-f01.lif +data/tif_folder/* +schematics/Jaccard_index_sparse.png +schematics/Jaccard_index_binary.png +docs/29_algorithm_validation/ideas.ipynb +docs/29_algorithm_validation/solution for exercise - metrics to investigate segmentation results.ipynb +docs/22_feature_extraction/blobs_analysis.csv +data/S-BIAD634 +docs/71_fine_tuning_hf/haesleinhuepf \ No newline at end of file diff --git a/CODE-OF-CONDUCT.md b/CODE-OF-CONDUCT.md new file mode 100644 index 0000000..9a59848 --- /dev/null +++ b/CODE-OF-CONDUCT.md @@ -0,0 +1,77 @@ + +# Contributor Covenant Code of Conduct + +## Our Pledge + 
+In the interest of fostering an open and welcoming environment, we as +contributors and maintainers pledge to make participation in our project and +our community a harassment-free experience for everyone, regardless of age, body +size, disability, ethnicity, sex characteristics, gender identity and expression, +level of experience, education, socio-economic status, nationality, personal +appearance, race, religion, or sexual identity and orientation. + +## Our Standards + +Examples of behavior that contributes to creating a positive environment +include: + +* Using welcoming and inclusive language +* Being respectful of differing viewpoints and experiences +* Gracefully accepting constructive criticism +* Focusing on what is best for the community +* Showing empathy towards other community members + +Examples of unacceptable behavior by participants include: + +* The use of sexualized language or imagery and unwelcome sexual attention or + advances +* Trolling, insulting/derogatory comments, and personal or political attacks +* Public or private harassment +* Publishing others' private information, such as a physical or electronic + address, without explicit permission +* Other conduct which could reasonably be considered inappropriate in a + professional setting + +## Our Responsibilities + +Project maintainers are responsible for clarifying the standards of acceptable +behavior and are expected to take appropriate and fair corrective action in +response to any instances of unacceptable behavior. + +Project maintainers have the right and responsibility to remove, edit, or +reject comments, commits, code, wiki edits, issues, and other contributions +that are not aligned to this Code of Conduct, or to ban temporarily or +permanently any contributor for other behaviors that they deem inappropriate, +threatening, offensive, or harmful. 
+ +## Scope + +This Code of Conduct applies within all project spaces, and it also applies when +an individual is representing the project or its community in public spaces. +Examples of representing a project or community include using an official +project e-mail address, posting via an official social media account, or acting +as an appointed representative at an online or offline event. Representation of +a project may be further defined and clarified by project maintainers. + +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be +reported by contacting the project lead robert.haase@tu-dresden.de. All +complaints will be reviewed and investigated and will result in a response that +is deemed necessary and appropriate to the circumstances. The project team is +obligated to maintain confidentiality with regard to the reporter of an incident. +Further details of specific enforcement policies may be posted separately. + +Project maintainers who do not follow or enforce the Code of Conduct in good +faith may face temporary or permanent repercussions as determined by other +members of the project's leadership. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, +available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html + +[homepage]: https://www.contributor-covenant.org + +For answers to common questions about this code of conduct, see +https://www.contributor-covenant.org/faq diff --git a/LICENSE-CC-BY b/LICENSE-CC-BY new file mode 100644 index 0000000..52bd145 --- /dev/null +++ b/LICENSE-CC-BY @@ -0,0 +1,395 @@ +Attribution 4.0 International + +======================================================================= + +Creative Commons Corporation ("Creative Commons") is not a law firm and +does not provide legal services or legal advice. Distribution of +Creative Commons public licenses does not create a lawyer-client or +other relationship. 
Creative Commons makes its licenses and related +information available on an "as-is" basis. Creative Commons gives no +warranties regarding its licenses, any material licensed under their +terms and conditions, or any related information. Creative Commons +disclaims all liability for damages resulting from their use to the +fullest extent possible. + +Using Creative Commons Public Licenses + +Creative Commons public licenses provide a standard set of terms and +conditions that creators and other rights holders may use to share +original works of authorship and other material subject to copyright +and certain other rights specified in the public license below. The +following considerations are for informational purposes only, are not +exhaustive, and do not form part of our licenses. + + Considerations for licensors: Our public licenses are + intended for use by those authorized to give the public + permission to use material in ways otherwise restricted by + copyright and certain other rights. Our licenses are + irrevocable. Licensors should read and understand the terms + and conditions of the license they choose before applying it. + Licensors should also secure all rights necessary before + applying our licenses so that the public can reuse the + material as expected. Licensors should clearly mark any + material not subject to the license. This includes other CC- + licensed material, or material used under an exception or + limitation to copyright. More considerations for licensors: + wiki.creativecommons.org/Considerations_for_licensors + + Considerations for the public: By using one of our public + licenses, a licensor grants the public permission to use the + licensed material under specified terms and conditions. If + the licensor's permission is not necessary for any reason--for + example, because of any applicable exception or limitation to + copyright--then that use is not regulated by the license. 
Our + licenses grant only permissions under copyright and certain + other rights that a licensor has authority to grant. Use of + the licensed material may still be restricted for other + reasons, including because others have copyright or other + rights in the material. A licensor may make special requests, + such as asking that all changes be marked or described. + Although not required by our licenses, you are encouraged to + respect those requests where reasonable. More_considerations + for the public: + wiki.creativecommons.org/Considerations_for_licensees + +======================================================================= + +Creative Commons Attribution 4.0 International Public License + +By exercising the Licensed Rights (defined below), You accept and agree +to be bound by the terms and conditions of this Creative Commons +Attribution 4.0 International Public License ("Public License"). To the +extent this Public License may be interpreted as a contract, You are +granted the Licensed Rights in consideration of Your acceptance of +these terms and conditions, and the Licensor grants You such rights in +consideration of benefits the Licensor receives from making the +Licensed Material available under these terms and conditions. + + +Section 1 -- Definitions. + + a. Adapted Material means material subject to Copyright and Similar + Rights that is derived from or based upon the Licensed Material + and in which the Licensed Material is translated, altered, + arranged, transformed, or otherwise modified in a manner requiring + permission under the Copyright and Similar Rights held by the + Licensor. For purposes of this Public License, where the Licensed + Material is a musical work, performance, or sound recording, + Adapted Material is always produced where the Licensed Material is + synched in timed relation with a moving image. + + b. 
Adapter's License means the license You apply to Your Copyright + and Similar Rights in Your contributions to Adapted Material in + accordance with the terms and conditions of this Public License. + + c. Copyright and Similar Rights means copyright and/or similar rights + closely related to copyright including, without limitation, + performance, broadcast, sound recording, and Sui Generis Database + Rights, without regard to how the rights are labeled or + categorized. For purposes of this Public License, the rights + specified in Section 2(b)(1)-(2) are not Copyright and Similar + Rights. + + d. Effective Technological Measures means those measures that, in the + absence of proper authority, may not be circumvented under laws + fulfilling obligations under Article 11 of the WIPO Copyright + Treaty adopted on December 20, 1996, and/or similar international + agreements. + + e. Exceptions and Limitations means fair use, fair dealing, and/or + any other exception or limitation to Copyright and Similar Rights + that applies to Your use of the Licensed Material. + + f. Licensed Material means the artistic or literary work, database, + or other material to which the Licensor applied this Public + License. + + g. Licensed Rights means the rights granted to You subject to the + terms and conditions of this Public License, which are limited to + all Copyright and Similar Rights that apply to Your use of the + Licensed Material and that the Licensor has authority to license. + + h. Licensor means the individual(s) or entity(ies) granting rights + under this Public License. + + i. 
Share means to provide material to the public by any means or + process that requires permission under the Licensed Rights, such + as reproduction, public display, public performance, distribution, + dissemination, communication, or importation, and to make material + available to the public including in ways that members of the + public may access the material from a place and at a time + individually chosen by them. + + j. Sui Generis Database Rights means rights other than copyright + resulting from Directive 96/9/EC of the European Parliament and of + the Council of 11 March 1996 on the legal protection of databases, + as amended and/or succeeded, as well as other essentially + equivalent rights anywhere in the world. + + k. You means the individual or entity exercising the Licensed Rights + under this Public License. Your has a corresponding meaning. + + +Section 2 -- Scope. + + a. License grant. + + 1. Subject to the terms and conditions of this Public License, + the Licensor hereby grants You a worldwide, royalty-free, + non-sublicensable, non-exclusive, irrevocable license to + exercise the Licensed Rights in the Licensed Material to: + + a. reproduce and Share the Licensed Material, in whole or + in part; and + + b. produce, reproduce, and Share Adapted Material. + + 2. Exceptions and Limitations. For the avoidance of doubt, where + Exceptions and Limitations apply to Your use, this Public + License does not apply, and You do not need to comply with + its terms and conditions. + + 3. Term. The term of this Public License is specified in Section + 6(a). + + 4. Media and formats; technical modifications allowed. The + Licensor authorizes You to exercise the Licensed Rights in + all media and formats whether now known or hereafter created, + and to make technical modifications necessary to do so. 
The + Licensor waives and/or agrees not to assert any right or + authority to forbid You from making technical modifications + necessary to exercise the Licensed Rights, including + technical modifications necessary to circumvent Effective + Technological Measures. For purposes of this Public License, + simply making modifications authorized by this Section 2(a) + (4) never produces Adapted Material. + + 5. Downstream recipients. + + a. Offer from the Licensor -- Licensed Material. Every + recipient of the Licensed Material automatically + receives an offer from the Licensor to exercise the + Licensed Rights under the terms and conditions of this + Public License. + + b. No downstream restrictions. You may not offer or impose + any additional or different terms or conditions on, or + apply any Effective Technological Measures to, the + Licensed Material if doing so restricts exercise of the + Licensed Rights by any recipient of the Licensed + Material. + + 6. No endorsement. Nothing in this Public License constitutes or + may be construed as permission to assert or imply that You + are, or that Your use of the Licensed Material is, connected + with, or sponsored, endorsed, or granted official status by, + the Licensor or others designated to receive attribution as + provided in Section 3(a)(1)(A)(i). + + b. Other rights. + + 1. Moral rights, such as the right of integrity, are not + licensed under this Public License, nor are publicity, + privacy, and/or other similar personality rights; however, to + the extent possible, the Licensor waives and/or agrees not to + assert any such rights held by the Licensor to the limited + extent necessary to allow You to exercise the Licensed + Rights, but not otherwise. + + 2. Patent and trademark rights are not licensed under this + Public License. + + 3. 
To the extent possible, the Licensor waives any right to + collect royalties from You for the exercise of the Licensed + Rights, whether directly or through a collecting society + under any voluntary or waivable statutory or compulsory + licensing scheme. In all other cases the Licensor expressly + reserves any right to collect such royalties. + + +Section 3 -- License Conditions. + +Your exercise of the Licensed Rights is expressly made subject to the +following conditions. + + a. Attribution. + + 1. If You Share the Licensed Material (including in modified + form), You must: + + a. retain the following if it is supplied by the Licensor + with the Licensed Material: + + i. identification of the creator(s) of the Licensed + Material and any others designated to receive + attribution, in any reasonable manner requested by + the Licensor (including by pseudonym if + designated); + + ii. a copyright notice; + + iii. a notice that refers to this Public License; + + iv. a notice that refers to the disclaimer of + warranties; + + v. a URI or hyperlink to the Licensed Material to the + extent reasonably practicable; + + b. indicate if You modified the Licensed Material and + retain an indication of any previous modifications; and + + c. indicate the Licensed Material is licensed under this + Public License, and include the text of, or the URI or + hyperlink to, this Public License. + + 2. You may satisfy the conditions in Section 3(a)(1) in any + reasonable manner based on the medium, means, and context in + which You Share the Licensed Material. For example, it may be + reasonable to satisfy the conditions by providing a URI or + hyperlink to a resource that includes the required + information. + + 3. If requested by the Licensor, You must remove any of the + information required by Section 3(a)(1)(A) to the extent + reasonably practicable. + + 4. 
If You Share Adapted Material You produce, the Adapter's + License You apply must not prevent recipients of the Adapted + Material from complying with this Public License. + + +Section 4 -- Sui Generis Database Rights. + +Where the Licensed Rights include Sui Generis Database Rights that +apply to Your use of the Licensed Material: + + a. for the avoidance of doubt, Section 2(a)(1) grants You the right + to extract, reuse, reproduce, and Share all or a substantial + portion of the contents of the database; + + b. if You include all or a substantial portion of the database + contents in a database in which You have Sui Generis Database + Rights, then the database in which You have Sui Generis Database + Rights (but not its individual contents) is Adapted Material; and + + c. You must comply with the conditions in Section 3(a) if You Share + all or a substantial portion of the contents of the database. + +For the avoidance of doubt, this Section 4 supplements and does not +replace Your obligations under this Public License where the Licensed +Rights include other Copyright and Similar Rights. + + +Section 5 -- Disclaimer of Warranties and Limitation of Liability. + + a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE + EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS + AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF + ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS, + IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION, + WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR + PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, + ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT + KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT + ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU. + + b. 
TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE + TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION, + NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT, + INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, + COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR + USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN + ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR + DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR + IN PART, THIS LIMITATION MAY NOT APPLY TO YOU. + + c. The disclaimer of warranties and limitation of liability provided + above shall be interpreted in a manner that, to the extent + possible, most closely approximates an absolute disclaimer and + waiver of all liability. + + +Section 6 -- Term and Termination. + + a. This Public License applies for the term of the Copyright and + Similar Rights licensed here. However, if You fail to comply with + this Public License, then Your rights under this Public License + terminate automatically. + + b. Where Your right to use the Licensed Material has terminated under + Section 6(a), it reinstates: + + 1. automatically as of the date the violation is cured, provided + it is cured within 30 days of Your discovery of the + violation; or + + 2. upon express reinstatement by the Licensor. + + For the avoidance of doubt, this Section 6(b) does not affect any + right the Licensor may have to seek remedies for Your violations + of this Public License. + + c. For the avoidance of doubt, the Licensor may also offer the + Licensed Material under separate terms or conditions or stop + distributing the Licensed Material at any time; however, doing so + will not terminate this Public License. + + d. Sections 1, 5, 6, 7, and 8 survive termination of this Public + License. + + +Section 7 -- Other Terms and Conditions. + + a. 
The Licensor shall not be bound by any additional or different + terms or conditions communicated by You unless expressly agreed. + + b. Any arrangements, understandings, or agreements regarding the + Licensed Material not stated herein are separate from and + independent of the terms and conditions of this Public License. + + +Section 8 -- Interpretation. + + a. For the avoidance of doubt, this Public License does not, and + shall not be interpreted to, reduce, limit, restrict, or impose + conditions on any use of the Licensed Material that could lawfully + be made without permission under this Public License. + + b. To the extent possible, if any provision of this Public License is + deemed unenforceable, it shall be automatically reformed to the + minimum extent necessary to make it enforceable. If the provision + cannot be reformed, it shall be severed from this Public License + without affecting the enforceability of the remaining terms and + conditions. + + c. No term or condition of this Public License will be waived and no + failure to comply consented to unless expressly agreed to by the + Licensor. + + d. Nothing in this Public License constitutes or may be interpreted + as a limitation upon, or waiver of, any privileges and immunities + that apply to the Licensor or You, including from the legal + processes of any jurisdiction or authority. + + +======================================================================= + +Creative Commons is not a party to its public +licenses. Notwithstanding, Creative Commons may elect to apply one of +its public licenses to material it publishes and in those instances +will be considered the “Licensor.” The text of the Creative Commons +public licenses is dedicated to the public domain under the CC0 Public +Domain Dedication. 
Except for the limited purpose of indicating that +material is shared under a Creative Commons public license or as +otherwise permitted by the Creative Commons policies published at +creativecommons.org/policies, Creative Commons does not authorize the +use of the trademark "Creative Commons" or any other trademark or logo +of Creative Commons without its prior written consent including, +without limitation, in connection with any unauthorized modifications +to any of its public licenses or any other arrangements, +understandings, or agreements concerning use of licensed material. For +the avoidance of doubt, this paragraph does not form part of the +public licenses. + +Creative Commons may be contacted at creativecommons.org. \ No newline at end of file diff --git a/docs/10_prompting.ipynb b/docs/10_prompting.ipynb new file mode 100644 index 0000000..1f3534d --- /dev/null +++ b/docs/10_prompting.ipynb @@ -0,0 +1,247 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "9ee67a8d-c234-4427-91c9-a0dc7033657e", + "metadata": {}, + "source": [ + "# Prompting language models\n", + "In this notebook we will send basic prompts to ChatGPT and receive answers." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "43c6f1dc-fb3c-463f-a144-90b017031600", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Central Europe is a region that is not precisely defined and can vary based on different cultural, historical, and geopolitical perspectives. However, it traditionally includes the following countries:\n", + "\n", + "1. Germany\n", + "2. Poland\n", + "3. Czech Republic\n", + "4. Austria\n", + "5. Switzerland\n", + "6. Hungary\n", + "7. Slovakia\n", + "8. 
Slovenia\n", + "\n", + "Sometimes, broader definitions might also include parts of:\n", + "\n", + "- Croatia\n", + "- Romania\n", + "- Serbia\n", + "\n", + "These countries are often considered part of Central Europe due to historical ties, cultural similarities, and geographic proximity. Nonetheless, regional definitions can vary, and some countries can be considered part of more than one European region." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Which countries are in central Europe?" + ] + }, + { + "cell_type": "markdown", + "id": "a256a055-c5ef-4846-a370-006c8a80a119", + "metadata": {}, + "source": [ + "## A comment on reproducibility\n", + "Note that prompt responses may not be very reproducible. You can execute the same prompt twice and receive different responses." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "8269c2a0-0a68-452d-9c20-1cabdaf66e1d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Central Europe is a region that does not have a universally agreed-upon definition, but it traditionally includes several countries based on cultural, historical, and geographical criteria. Generally, the following countries are considered part of Central Europe:\n", + "\n", + "1. Germany\n", + "2. Poland\n", + "3. Czech Republic\n", + "4. Austria\n", + "5. Switzerland\n", + "6. Hungary\n", + "7. Slovakia\n", + "8. Slovenia\n", + "9. Liechtenstein\n", + "\n", + "Some definitions might also include parts of other countries such as northern Italy, eastern France, and western Ukraine, depending on the context. The exact composition can vary based on political, cultural, and economic considerations." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Which countries are in central Europe?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "e519d67e-38b0-48b6-a994-5f30dfd3bfe7", + "metadata": {}, + "source": [ + "## Precise results\n", + "As you can see, the language model has to speculate a bit because we haven't defined precisely which region we are referring to. \n", + "The more precise our prompt is, the more precise (and reproducible) the results will be." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "6611e5ac-1a35-4e7b-b28b-d66a0f8df39a", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "The Schengen Area, as of my last update, consisted of 27 European countries that had abolished passport control and other forms of border control at their mutual borders. While I cannot provide details specifically for 2024, the list has remained relatively stable over recent years. Here are the countries that were part of the Schengen Area:\n", + "\n", + "1. Austria\n", + "2. Belgium\n", + "3. Czech Republic\n", + "4. Denmark\n", + "5. Estonia\n", + "6. Finland\n", + "7. France\n", + "8. Germany\n", + "9. Greece\n", + "10. Hungary\n", + "11. Iceland\n", + "12. Italy\n", + "13. Latvia\n", + "14. Liechtenstein\n", + "15. Lithuania\n", + "16. Luxembourg\n", + "17. Malta\n", + "18. Netherlands\n", + "19. Norway\n", + "20. Poland\n", + "21. Portugal\n", + "22. Slovakia\n", + "23. Slovenia\n", + "24. Spain\n", + "25. Sweden\n", + "26. Switzerland\n", + "\n", + "Bulgaria, Croatia, Cyprus, and Romania are members of the European Union but were not part of the Schengen Area as of my last update. Please check the most current information to confirm if any changes have occurred since then." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Which countries are part of the Schengen region in 2024?" 
+ ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "e9a64a5e-0a91-4f1f-b82e-8fc4afd70592", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "As of my last update, the Schengen Area comprises 27 European countries that have abolished passport and other types of border control at their mutual borders. The countries that are typically part of the Schengen Area include:\n", + "\n", + "1. Austria\n", + "2. Belgium\n", + "3. Czech Republic\n", + "4. Denmark\n", + "5. Estonia\n", + "6. Finland\n", + "7. France\n", + "8. Germany\n", + "9. Greece\n", + "10. Hungary\n", + "11. Iceland\n", + "12. Italy\n", + "13. Latvia\n", + "14. Liechtenstein\n", + "15. Lithuania\n", + "16. Luxembourg\n", + "17. Malta\n", + "18. Netherlands\n", + "19. Norway\n", + "20. Poland\n", + "21. Portugal\n", + "22. Slovakia\n", + "23. Slovenia\n", + "24. Spain\n", + "25. Sweden\n", + "26. Switzerland\n", + "\n", + "Additionally, Bulgaria, Croatia, Cyprus, and Romania are EU member states that are legally obliged to join the Schengen Area once they meet the necessary conditions, and Croatia joined the Schengen Area in 2023. \n", + "\n", + "Please verify with a current source to check for any changes in status or membership." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Which countries are part of the Schengen region in 2024?" + ] + }, + { + "cell_type": "markdown", + "id": "57478d74-542f-41ee-9e12-87e3c7d7b7f1", + "metadata": {}, + "source": [ + "## Exercise\n", + "Continue asking questions about well-known facts such as continents and countries. \n", + "Afterwards, ask philosophical questions, for example about the meaning of life." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5decbd54-e27a-4e91-b88c-602290795b78", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LLM Kernel (gpt-4o-2024-08-06)", + "language": "prompt", + "name": "llm-kernel" + }, + "language_info": { + "codemirror_mode": { + "name": "prompt" + }, + "mimetype": "text/x-prompt", + "name": "prompt" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/20_use_cases.ipynb b/docs/20_use_cases.ipynb new file mode 100644 index 0000000..baacc4e --- /dev/null +++ b/docs/20_use_cases.ipynb @@ -0,0 +1,301 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "0532bcef-9980-4ab7-b8d0-c331a4949cbf", + "metadata": {}, + "source": [ + "# Use cases\n", + "In this notebook we collect a few useful and a few not-so-useful use-cases for large language models." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "6d86a1a5-a4c8-4d8d-b762-884882eb819a", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Was ist das Lieblingsweihnachtslied aller Eltern? Stille Nacht!" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Translate the following joke to German:\n", + "\n", + "What is every parent's favorite Christmas song? Silent Night!" + ] + }, + { + "cell_type": "markdown", + "id": "80a222b8-aac1-4d65-a678-cb0269b5c621", + "metadata": {}, + "source": [ + "## Information retrieval\n", + "Large Language Models are well-suited for extracting information from text. To do this, you need to:\n", + "1. formulate your question\n", + "2. provide data\n", + "3. 
define the output format" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "f57c6027-c1f5-46fb-986a-abd5e083f3f0", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "* Number of students asked: 100 \n", + "* Cloud storage size: 250 GB" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Given a story, please restructure the contained data.\n", + "\n", + "Story:\n", + "The compute center of the university recently conducted a study to figure \n", + "out needs of students for computational resources. After interviewing 100 \n", + "students they concluded that most of them were happy with 250 GB of cloud\n", + "storage.\n", + "\n", + "Restructure the story above to extract these numbers:\n", + "* Number of students asked:\n", + "* Cloud storage size:" + ] + }, + { + "cell_type": "markdown", + "id": "23ea4506-dcf4-4747-9d20-ddef994c1ebc", + "metadata": {}, + "source": [ + "## Browsing for information\n", + "If you are working with an LLM that cannot search the internet, it may not be able to answer certain questions." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "d02f3047-b2d4-41ad-8275-eac8ac53f46c", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "I'm unable to provide real-time weather updates. To find out if you'll need a jacket today, I recommend checking a reliable weather website or app like the Weather Channel or using a voice-activated assistant for the latest forecast in your area." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "How will be the wheather today? Will I need a jacket?" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "b6fc5028-a4b7-4ee4-b52b-757332775ee2", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/markdown": [ + "Today is October 28, 2023." 
+ ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Which day is today?" + ] + }, + { + "cell_type": "markdown", + "id": "14319932-be45-45e7-b191-32a8fd9c5b93", + "metadata": {}, + "source": [ + "## Providing more information\n", + "If you provide more information, the LLM's results will better match your expectations." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "63d62b2d-7482-4963-98bd-ddce6bc26541", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Given that today is August 16th, 2024, and you're in Central Europe with the sun shining, it's likely to be warm. However, whether you need a jacket depends on the specific temperature and your personal preference for warmth. Generally, August in Central Europe tends to be warm, so if the temperatures are typical for this time of year, you might not need a jacket during the day. It's always a good idea to check the current temperature and forecast for a more specific recommendation." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Today is August 16th 2024. \n", + "I'm in Central Europe and the sun is shining. \n", + "Will I need a jacket today?" + ] + }, + { + "cell_type": "markdown", + "id": "e4da1504-4b6a-4a65-ace6-66ab0d2c1da9", + "metadata": {}, + "source": [ + "## Counting\n", + "Many contemporary LLMs have issues with counting objects." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "70367746-455d-4d0e-94b6-5e9133089b32", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "The word \"Woolloomooloo\" contains eight 'o's." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "How many o are in Woolloomooloo ?" 
+ ] + }, + { + "cell_type": "markdown", + "id": "bf90c256-557a-4f56-9e96-8a047d2adce0", + "metadata": {}, + "source": [ + "## Other use-cases" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "8e4958e9-3268-4a7f-94fc-a92ee13500c1", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "I have 5 apples, 6 oranges and 4 potatoes. How many fruits to I have" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "I have 5 apples, 6 oranges and 4 potatoes. How many fruits to I have?" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "9acdb73c-bc47-41e7-b8dd-82f4c882ce96", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "If traveling from Aarhus to Copenhagen takes about 3.5 hours and from Aarhus to Odense takes 2 hours, then the remaining time from Odense to Copenhagen would be the difference between the two trips:\n", + "\n", + "3.5 hours (Aarhus to Copenhagen) - 2 hours (Aarhus to Odense) = 1.5 hours\n", + "\n", + "Therefore, it would take about 1.5 hours to travel from Copenhagen to Odense by car." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Traveling from Aarhus to Copenhagen by car takes about 3.5 hours.\n", + "From Aarhus to Odense it takes 2 hours.\n", + "\n", + "How long do I need from Copenhagen to Odense?" + ] + }, + { + "cell_type": "markdown", + "id": "7e98bbe0-ed42-41a9-8230-97fff380f78a", + "metadata": {}, + "source": [ + "## Exercise\n", + "Come up with two more examples: one where ChatGPT works reproducibly, and one where it reproducibly gives wrong answers or refuses to answer." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5827d332-7c95-4a38-8cec-717ac8d6d9b0", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LLM Kernel (gpt-4o-2024-08-06)", + "language": "prompt", + "name": "llm-kernel" + }, + "language_info": { + "codemirror_mode": { + "name": "prompt" + }, + "mimetype": "text/x-prompt", + "name": "prompt" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/30_prompt_engineering.ipynb b/docs/30_prompt_engineering.ipynb new file mode 100644 index 0000000..a7cc9f5 --- /dev/null +++ b/docs/30_prompt_engineering.ipynb @@ -0,0 +1,272 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "5fb1cd17-f320-4f59-a0e7-28fd1acd59c8", + "metadata": {}, + "source": [ + "# Prompt Engineering\n", + "\n", + "In this notebook we learn how prompts can be designed to carefully modify the output format. \n", + "The more precisely we formulate the desired output, the closer this output will be to our expectation.\n", + "The act of making the prompt more precise can be coined \"prompt engineering\"." + ] + }, + { + "cell_type": "markdown", + "id": "cf47989d-d12f-4530-8e88-38c06113cbef", + "metadata": {}, + "source": [ + "First let's ask for a simple task." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "7691d5af-803e-4b35-b76f-d54542674a72", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "When embarking on your thesis journey, it's essential to approach the project with careful planning and structured organization. Here are some key tips to guide you through the process:\n", + "\n", + "1. **Timing and Commitment**: Before you submit your thesis registration form, ensure you have a realistic timeline in mind. Be confident that you can complete the thesis within five months. 
This involves considering all your other obligations such as coursework, job responsibilities, or personal commitments, and determining how they may impact your ability to focus on your thesis.\n", + "\n", + "2. **Early Writing and Chapter Development**: Don't wait to start writing your thesis. Begin with the fundamentals or background chapter as soon as you can. This section often involves consolidating existing knowledge and can be drafted early in your process. Early writing not only clarifies your thinking but also helps you establish a writing routine. Utilize any preparatory research or coursework material you may have as a foundation for your thesis.\n", + "\n", + "3. **Reverse Engineering and Buffer Time**: Aim to complete your thesis draft one month before the submission deadline. This plan allows you critical buffer time to review and revise your work. Use this month to address any feedback from your advisor, conduct thorough proofreading, and make any necessary refinements. This period also gives you the leeway to manage unforeseen challenges without the pressure of the looming final deadline.\n", + "\n", + "By following these guidelines, you'll be better positioned to manage your time effectively, produce a high-quality thesis, and alleviate some of the stress associated with last-minute writing and revisions. Remember, thorough preparation and an early start can make a significant difference in the quality of your thesis and your overall academic experience." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "You are a professional academic consultant who gives advice to students for writing their thesis.\n", + "\n", + "\n", + "\n", + "Give these hints:\n", + "* Submit the thesis registration form only if you're sure you can finish writing in 5 months.\n", + "* Start writing your thesis early, e.g. 
the fundamentals chapter can be written from day one.\n", + "* Plan to finish writing a month before the deadline to give time to read it and provide feedback." + ] + }, + { + "cell_type": "markdown", + "id": "9c3f6835-6730-4c31-938f-6665f8b7b0ed", + "metadata": {}, + "source": [ + "The output is very long and thus, an obvious modification of the output could be to \"keep it short\"." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "a4a5ff4c-4392-4b20-904d-f54b36550e8e", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "As a professional academic consultant, here are some helpful hints for writing your thesis:\n", + "\n", + "1. Only submit the thesis registration form if you are confident about completing it within the given timeframe of 5 months.\n", + "\n", + "2. Begin writing your thesis early to ensure a smooth process. For instance, you can start with the fundamentals chapter right from day one.\n", + "\n", + "3. Plan your schedule effectively to finish writing your thesis at least a month before the submission deadline. This will allow ample time for reading, revising, and receiving feedback.\n", + "\n", + "By following these recommendations, you'll have a better chance of successfully completing your thesis on time and with improved quality. Good luck!" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "You are a professional academic consultant who gives advice to students for writing their thesis.\n", + "Keep it short.\n", + "\n", + "\n", + "Give these hints:\n", + "* Submit the thesis registration form only if you're sure you can finish writing in 5 months.\n", + "* Start writing your thesis early, e.g. the fundamentals chapter can be written from day one.\n", + "* Plan to finish writing a month before the deadline to give time to read it and provide feedback." 
+ ] + }, + { + "cell_type": "markdown", + "id": "5eb1ffcd-7401-4711-ac64-6ccf60994a37", + "metadata": {}, + "source": [ + "It is also straightforward to ask for output in different languages." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "c6dc6451-29a6-4bb3-bdd5-7bc957ae0b4c", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Als akademischer Berater rate ich Ihnen:\n", + "* Schicken Sie das Anmeldeformular für Ihre Abschlussarbeit nur dann ab, wenn Sie sicher sind, dass Sie innerhalb von 5 Monaten fertig schreiben können.\n", + "* Beginnen Sie frühzeitig mit dem Schreiben Ihrer Thesis, zum Beispiel kann das Kapitel über die Grundlagen von Anfang an geschrieben werden.\n", + "* Planen Sie, einen Monat vor der Abgabefrist mit dem Schreiben fertig zu sein, um genügend Zeit zum Korrekturlesen und Feedbackgeben zu haben." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "You are a professional academic consultant who gives advice to students for writing their thesis.\n", + "Keep it short.\n", + "Provide advice in German.\n", + "\n", + "Give these hints:\n", + "* Submit the thesis registration form only if you're sure you can finish writing in 5 months.\n", + "* Start writing your thesis early, e.g. the fundamentals chapter can be written from day one.\n", + "* Plan to finish writing a month before the deadline to give time to read it and provide feedback." + ] + }, + { + "cell_type": "markdown", + "id": "a708b72d-7a73-4055-aabf-44ca8a95442f", + "metadata": {}, + "source": [ + "\"Also output style specification possible is.\", Yoda said." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "b4a523df-7711-4a34-b55e-dfcb9c3590a7", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Hints for thesis writing, I have. 
Follow them, you must.\n", + "\n", + "* Only submit the thesis registration form, if in 5 months, complete writing, you are sure.\n", + "* From day one, start writing your thesis, you must. Begin with the fundamentals chapter.\n", + "* A month before the deadline, finish writing, you should. Time to read and provide feedback, it will give you.\n", + "\n", + "Follow these hints, you must. Successful thesis writing, you will achieve. Hmmmm." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "You are a professional academic consultant who gives advice to students for writing their thesis.\n", + "Keep it short.\n", + "Write like Yoda would.\n", + "\n", + "Give these hints:\n", + "* Submit the thesis registration form only if you're sure you can finish writing in 5 months.\n", + "* Start writing your thesis early, e.g. the fundamentals chapter can be written from day one.\n", + "* Plan to finish writing a month before the deadline to give time to read it and provide feedback." + ] + }, + { + "cell_type": "markdown", + "id": "89b983bd-ba9c-4d12-95e0-9cc7e3e75f78", + "metadata": {}, + "source": [ + "When ChatGPT explains technicalities, a common trick is to frame the request as \"Explain it like I'm 5 years old.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "fc15487c-431c-4339-8fd7-8cffebbeb81a", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Hallo! Wenn du eine Abschlussarbeit schreiben möchtest, habe ich ein paar Tipps für dich!\n", + "\n", + "1. Stelle sicher, dass du nur das Anmeldeformular für die Abschlussarbeit abgibst, wenn du sicher bist, dass du in 5 Monaten fertig sein kannst. Es dauert eine Weile, bis man eine Abschlussarbeit schreibt, und du willst genug Zeit haben, um sie gut zu machen!\n", + "\n", + "2. Fang früh an mit dem Schreiben deiner Abschlussarbeit. Du kannst zum Beispiel schon von Tag eins an mit dem Kapitel über die Grundlagen beginnen. 
So hast du mehr Zeit für den Rest der Arbeit.\n", + "\n", + "3. Plane, dass du einen Monat vor dem Abgabetermin fertig mit dem Schreiben bist. Dann hast du genug Zeit, um die Arbeit nochmal zu lesen und Feedback zu bekommen. Das ist wichtig, um sie noch besser zu machen.\n", + "\n", + "Ich hoffe, diese Tipps helfen dir weiter! Viel Erfolg bei deiner Abschlussarbeit!" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "You are a professional academic consultant who gives advice to students for writing their thesis.\n", + "Keep it short.\n", + "Provide advice in German.\n", + "Explain it like I'm 5 years old.\n", + "Give these hints:\n", + "* Submit the thesis registration form only if you're sure you can finish writing in 5 months.\n", + "* Start writing your thesis early, e.g. the fundamentals chapter can be written from day one.\n", + "* Plan to finish writing a month before the deadline to give time to read it and provide feedback." + ] + }, + { + "cell_type": "markdown", + "id": "79cc2c19-536a-43a3-9233-23b02d77c0aa", + "metadata": {}, + "source": [ + "## Exercise\n", + "Request the advice in different languages. Also consider languages such as Klingon." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d79a98bf-56e0-43de-b3e8-d5ab1ca4b883", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LLM Kernel (gpt-4o-2024-08-06)", + "language": "prompt", + "name": "llm-kernel" + }, + "language_info": { + "codemirror_mode": { + "name": "prompt" + }, + "mimetype": "text/x-prompt", + "name": "prompt" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/40_code_generation.ipynb b/docs/40_code_generation.ipynb new file mode 100644 index 0000000..7500f85 --- /dev/null +++ b/docs/40_code_generation.ipynb @@ -0,0 +1,293 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "2ea8d0f5-484f-47bc-9b44-f752c7423c57", + "metadata": {}, + "source": [ + "# Code generation\n", + "LLMs are excellent code writers because writing code is a translation task, e.g. from English to Python." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "0bf9d182-d4f1-428e-a4af-a5dca6e509f5", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Certainly! You can print out the numbers between 0 and 10 using a simple `for` loop in Python. Here's the code to do that:\n", + "\n", + "```python\n", + "for number in range(11):\n", + " print(number)\n", + "```\n", + "\n", + "This code uses the `range` function to generate numbers from 0 to 10, and then iterates over each number to print it out." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Write Python code for printing out the numbers between 0 and 10." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "3905aca5-3a90-43de-890e-896f45a5414d", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Here is a simple Python function that returns the string \"Hello World\":\n", + "\n", + "```python\n", + "def get_hello_world():\n", + " return \"Hello World\"\n", + "\n", + "# Example usage\n", + "message = get_hello_world()\n", + "print(message)\n", + "```\n", + "\n", + "This code defines a function named `get_hello_world` that, when called, returns the string \"Hello World\". The example usage demonstrates how to call the function and print the result." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Write Python code for a function that returns \"Hello World\"." + ] + }, + { + "cell_type": "markdown", + "id": "0cff3ab7-4d76-4201-bc80-edc055f51f88", + "metadata": {}, + "source": [ + "## Explaining code\n", + "This also works the other way around, e.g. if you ask the LLM what a specific code section does." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "39969a5e-6ba4-4165-a277-b5f2d2ad3088", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "This Python code is manipulating a pandas DataFrame named `df`. Here’s a breakdown of what each line does:\n", + "\n", + "1. `df['aspect_ratio'] = df['minor_axis_length'] / df['major_axis_length']`: \n", + " - This line creates a new column in the DataFrame called `aspect_ratio`. \n", + " - It calculates the aspect ratio by dividing the values in the `'minor_axis_length'` column by the corresponding values in the `'major_axis_length'` column.\n", + "\n", + "2. 
`filtered_df = df[df['aspect_ratio'] > 2]`: \n", + " - This line filters the DataFrame to create a new DataFrame called `filtered_df`.\n", + " - It includes only the rows where the `aspect_ratio` is greater than 2.\n", + " - The resulting `filtered_df` will therefore contain only those rows from the original DataFrame (`df`) that meet this condition. \n", + "\n", + "In summary, this code calculates an aspect ratio for each row and then filters the DataFrame to include only those rows where the aspect ratio is greater than 2." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "I have this Python code, what is it doing?\n", + "\n", + "df['aspect_ratio'] = df['minor_axis_length'] / df['major_axis_length']\n", + "filtered_df = df[df['aspect_ratio'] > 2]" + ] + }, + { + "cell_type": "markdown", + "id": "a36c3a5f-eb0a-4e66-aead-f18f0e1d45a8", + "metadata": {}, + "source": [ + "## Code documentation\n", + "Assume a data scientist was in a rush and wrote code quickly. Unfortunately, they wrote code without any comments and that's why it is hard to know what the code does. LLMs can help with adding comments to code." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "2a9e7e00-8f4c-4323-a80b-ede608d3341b", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Certainly! 
Here is the Python code with comments added to explain each step:\n", + "\n", + "```python\n", + "# Read the image from the specified file\n", + "image = imread(\"data/blobs.tif\")\n", + "\n", + "# Apply a Gaussian blur to the image with a sigma of 1 to smooth it\n", + "blurred_image = filters.gaussian(image, sigma=1)\n", + "\n", + "# Calculate the Otsu's threshold value for the blurred image\n", + "threshold = filters.threshold_otsu(blurred_image)\n", + "\n", + "# Create a binary image where pixels with values greater than or equal to the threshold are True\n", + "thresholded_image = blurred_image >= threshold\n", + "\n", + "# Label the connected components in the binary image\n", + "label_image = measure.label(thresholded_image)\n", + "\n", + "# Display the labeled image with the labels shown\n", + "imshow(label_image, labels=True)\n", + "```\n", + "\n", + "This code processes an image by blurring it, applying a threshold to create a binary image, labeling the connected components, and displaying the labeled image." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "I have this Python code. Can you add comments to it?\n", + "\n", + "image = imread(\"data/blobs.tif\")\n", + "blurred_image = filters.gaussian(image, sigma=1)\n", + "threshold = filters.threshold_otsu(blurred_image)\n", + "thresholded_image = blurred_image >= threshold\n", + "label_image = measure.label(thresholded_image)\n", + "imshow(label_image, labels=True)" + ] + }, + { + "cell_type": "markdown", + "id": "9d1ae14d-8738-4c23-bd5d-09b38e554b98", + "metadata": {}, + "source": [ + "## Fixing errors in code\n", + "You can also ask LLMs to assist in fixing code, e.g. if the code above is executed, there is an error message. If we provide the error message to the LLM, it can assist us in solving the problem." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "4507f71e-b491-4dd5-b3e9-9981c643929f", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "The error message you're encountering indicates that the `imread` function is not recognized. This usually means that the necessary library for image reading hasn't been imported or is not installed. The function `imread` is commonly used for reading images and is part of libraries such as OpenCV or `skimage`.\n", + "\n", + "Here’s how you can modify the code to include the appropriate imports with the `skimage` library:\n", + "\n", + "```python\n", + "from skimage.io import imread, imshow\n", + "from skimage import filters, measure\n", + "\n", + "# Read the image file\n", + "image = imread(\"data/blobs.tif\")\n", + "\n", + "# Apply Gaussian blur to the image with a given sigma value\n", + "blurred_image = filters.gaussian(image, sigma=1)\n", + "\n", + "# Compute the Otsu threshold on the blurred image\n", + "threshold = filters.threshold_otsu(blurred_image)\n", + "\n", + "# Create a boolean image where pixels are marked as true if they are above the threshold\n", + "thresholded_image = blurred_image >= threshold\n", + "\n", + "# Label connected regions in the thresholded image\n", + "label_image = measure.label(thresholded_image)\n", + "\n", + "# Display the labeled image\n", + "imshow(label_image, labels=True)\n", + "```\n", + "\n", + "Please ensure you have the `scikit-image` library installed. You can install it via pip if necessary:\n", + "\n", + "```bash\n", + "pip install scikit-image\n", + "```\n", + "\n", + "After running the above code with the necessary imports and ensuring all libraries are installed, it should work without the described error." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "When I execute the code above in Python, there is an error message. 
Help me make the code work!\n", + "\n", + "---------------------------------------------------------------------------\n", + "NameError Traceback (most recent call last)\n", + "Cell In[1], line 1\n", + "----> 1 image = imread(\"data/blobs.tif\")\n", + " 2 blurred_image = filters.gaussian(image, sigma=1)\n", + " 3 threshold = filters.threshold_otsu(blurred_image)\n", + "\n", + "NameError: name 'imread' is not defined" + ] + }, + { + "cell_type": "markdown", + "id": "e3dc5cf7-d0e0-46a5-a64f-f5f84f79e42f", + "metadata": {}, + "source": [ + "## Exercise\n", + "Run the code above and ask an LLM to assist you in making it work. \n", + "\n", + "Hint: In case of trouble, the code was taken from [here](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/22_feature_extraction/quantitative_measurements.html)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5d9319ad-f0fe-4a90-bea9-743fc75bfc0a", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LLM Kernel (gpt-4o-2024-08-06)", + "language": "prompt", + "name": "llm-kernel" + }, + "language_info": { + "codemirror_mode": { + "name": "prompt" + }, + "mimetype": "text/x-prompt", + "name": "prompt" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/50_data_analysis.ipynb b/docs/50_data_analysis.ipynb new file mode 100644 index 0000000..90bcb76 --- /dev/null +++ b/docs/50_data_analysis.ipynb @@ -0,0 +1,215 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "84df19dd-e9ae-403a-8d4c-118aac29486e", + "metadata": {}, + "source": [ + "# Data analysis\n", + "Many LLMs are excellent data scientists. Just a reminder: it is important to specify clearly what code should do." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "0db11be8-4f1d-4581-aef3-3f05491d16e7", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "To iterate over all files in a folder, open each `.tif` image, and display it, you can use the following Python code. This code uses the `os` module to navigate the directory and the `PIL` (Pillow) library to open and display images. Make sure you have the Pillow library installed (`pip install Pillow`) before running this code:\n", + "\n", + "```python\n", + "import os\n", + "from PIL import Image\n", + "\n", + "def show_tif_images_in_folder(folder_path):\n", + " # List all files in the given folder\n", + " for filename in os.listdir(folder_path):\n", + " # Get the full file path\n", + " file_path = os.path.join(folder_path, filename)\n", + " \n", + " # Check if the current file is a .tif file\n", + " if filename.lower().endswith('.tif'):\n", + " try:\n", + " # Open and display the .tif file\n", + " with Image.open(file_path) as img:\n", + " img.show()\n", + " except IOError as e:\n", + " print(f\"Cannot open {file_path}: {e}\")\n", + "\n", + "# Replace 'your_folder_path_here' with the path to your folder containing .tif images\n", + "folder_path = 'your_folder_path_here'\n", + "show_tif_images_in_folder(folder_path)\n", + "```\n", + "\n", + "### Notes:\n", + "1. **Folder Path**: Replace `'your_folder_path_here'` with the actual path to the folder containing your `.tif` images.\n", + "2. **Error Handling**: The code includes error handling to skip files that cannot be opened as images. This prevents the program from crashing if it encounters a corrupted file or a non-readable .tif.\n", + "3. **Case Insensitivity**: The check for the `.tif` extension is case insensitive (`.tif` vs. `.TIF`).\n", + "\n", + "Make sure the folder path is correctly set and validated according to your filesystem." 
+ ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Give me Python code for iterating over all files in a folder.\n", + "If the file is a .tif image, open it and show it." + ] + }, + { + "cell_type": "markdown", + "id": "a58d7955-7293-4206-88e0-92c62a871472", + "metadata": {}, + "source": [ + "## Chat-interaction for modifying code\n", + "LLM platforms commonly come as chatbots, so we can refer back to previously generated code and ask the LLM to modify it according to our needs." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "1f3a0ca4-b6c7-40b6-9b6e-da1fb4f84cbc", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Certainly! Below is a Python script that iterates over all files in a specified folder, checks if each file is a `.tif` image, and opens it for display. This script does not encapsulate the logic within a function:\n", + "\n", + "```python\n", + "import os\n", + "from PIL import Image\n", + "import matplotlib.pyplot as plt\n", + "\n", + "# Specify the directory containing the files\n", + "directory = '/path/to/your/directory'\n", + "\n", + "# Iterate over all files in the specified directory\n", + "for filename in os.listdir(directory):\n", + " # Construct the full file path\n", + " file_path = os.path.join(directory, filename)\n", + " \n", + " # Check if the file is a .tif image\n", + " if os.path.isfile(file_path) and filename.lower().endswith('.tif'):\n", + " # Open the .tif image\n", + " image = Image.open(file_path)\n", + " \n", + " # Display the image using matplotlib\n", + " plt.imshow(image)\n", + " plt.title(f\"Displaying: {filename}\")\n", + " plt.axis('off') # Hide axes\n", + " plt.show()\n", + "\n", + " # Alternatively, you can display using PIL's show method:\n", + " # image.show()\n", + "```\n", + "\n", + "### Instructions:\n", + "\n", + "- Replace `'/path/to/your/directory'` with the actual path of the folder you want to iterate 
over.\n", + "- Make sure that both the `PIL` (Pillow) library and `matplotlib` are installed in your Python environment. You can install them using `pip install Pillow matplotlib` if needed.\n", + "- The `plt.show()` line will open each `.tif` image one at a time. If you have a large number of images to display, you may want to include a mechanism to close the plot window or process in batches.\n", + "\n", + "This code goes through each file in the specified directory, checks if the file extension is `.tif` (case-insensitive), and displays the image accordingly." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Could you also write code that does not use a function?" + ] + }, + { + "cell_type": "markdown", + "id": "f517c9e6-96ce-4976-86e1-6774e1266e6c", + "metadata": {}, + "source": [ + "## Asking to explain code\n", + "LLMs open the door for non-programmers to write code. It is extremely important to make sure that code does what it is supposed to do. Hence, if we find code which we don't understand, we can ask the LLM to explain it to us." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "33ec9391-8fef-429c-9562-2582d55fcb2c", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "In the initial code example, the `try: ... except: ...` block is used to handle any potential errors that may occur when attempting to open and display an image file. Here's a more detailed explanation:\n", + "\n", + "- **`try:` block**: This block contains the code that might raise an exception. In the context of your task, this involves opening an image file using the `Image.open()` function and attempting to display it with the `show()` method. If everything goes well, the code in this block executes without any issues.\n", + "\n", + "- **`except:` block**: If an exception occurs in the `try:` block, control is immediately transferred to the `except:` block. 
The `except:` block defines what should be done if an error occurs. In the context of opening and showing an image, common exceptions might include file corruption, issues with file permissions, or the file not being a valid image. By catching these exceptions, your code can handle the error gracefully (e.g., by printing an error message) instead of crashing.\n", + "\n", + "Using a `try: ... except: ...` block is a way to make the code more robust and user-friendly, ensuring that errors are handled appropriately without stopping the entire script execution." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "What does the try: ... except: ... do in the first code example?" + ] + }, + { + "cell_type": "markdown", + "id": "a77fc349-e5b5-4fbd-a98a-2ec1612dd452", + "metadata": {}, + "source": [ + "## Exercise\n", + "Ask an LLM / Chatbot to generate Python code for:\n", + "* loading a .csv file of your choice ([or this one]()) as a DataFrame\n", + "* computing descriptive statistics of columns, e.g. mean and standard deviation.\n", + "* plotting two numeric columns against each other using seaborn.\n", + "\n", + "Optional:\n", + "* Ask the LLM what a DataFrame is.\n", + "* Ask the LLM what seaborn is.\n", + "* Ask the LLM to modify the plot, e.g. change the colours of data points." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d3492fb1-98d8-4596-853b-6522d28ce3ac", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LLM Kernel (gpt-4o-2024-08-06)", + "language": "prompt", + "name": "llm-kernel" + }, + "language_info": { + "codemirror_mode": { + "name": "prompt" + }, + "mimetype": "text/x-prompt", + "name": "prompt" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/60_bio-image-analysis.ipynb b/docs/60_bio-image-analysis.ipynb new file mode 100644 index 0000000..d0792fd --- /dev/null +++ b/docs/60_bio-image-analysis.ipynb @@ -0,0 +1,398 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "14addf66-dac0-43f1-8414-c8ae74139dff", + "metadata": {}, + "source": [ + "# Analysing image data\n", + "LLMs can also assist you in analysing image data. " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "7e8aa79b-7c7f-42ac-bdaf-e546e9e47065", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "To analyze an image programmatically, you can use libraries like OpenCV and PIL (Pillow) in Python. 
Below is a sample code that demonstrates how to load an image, convert it to grayscale, and detect edges using OpenCV:\n", + "\n", + "```python\n", + "import cv2\n", + "from matplotlib import pyplot as plt\n", + "\n", + "# Load an image from file\n", + "image_path = 'path_to_your_image.jpg'\n", + "image = cv2.imread(image_path)\n", + "\n", + "# Convert the image to grayscale\n", + "gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)\n", + "\n", + "# Use Canny Edge Detection\n", + "edges = cv2.Canny(gray_image, threshold1=100, threshold2=200)\n", + "\n", + "# Display the original image and the edge-detected image\n", + "plt.figure(figsize=(10, 5))\n", + "\n", + "plt.subplot(1, 2, 1)\n", + "plt.title('Original Image')\n", + "plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))\n", + "\n", + "plt.subplot(1, 2, 2)\n", + "plt.title('Edge Detected Image')\n", + "plt.imshow(edges, cmap='gray')\n", + "\n", + "plt.show()\n", + "```\n", + "\n", + "### Explanation:\n", + "- **OpenCV**: A powerful library for computer vision tasks. We use it to load images and perform image processing operations.\n", + "- **cv2.imread()**: Reads an image from a file.\n", + "- **cv2.cvtColor()**: Converts the image from one color space to another, here from BGR to Grayscale.\n", + "- **cv2.Canny()**: Performs edge detection using the Canny algorithm.\n", + "- **matplotlib.pyplot**: Used here for displaying images.\n", + "\n", + "### Requirements:\n", + "- Ensure OpenCV and Matplotlib are installed. You can install them via pip if they are not already installed:\n", + " ```sh\n", + " pip install opencv-python-headless matplotlib\n", + " ```\n", + "\n", + "Replace `'path_to_your_image.jpg'` with the actual path to your image file. This code will display both the original and processed edge-detected images side by side. Adjust the `threshold1` and `threshold2` parameters in `cv2.Canny()` to fine-tune edge detection results based on specific needs or image characteristics." 
+ ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Write code for analysing an image." + ] + }, + { + "cell_type": "markdown", + "id": "c30564b9-d48b-47fd-ab47-67919f7ef293", + "metadata": {}, + "source": [ + "## Scientific image analysis using deep learning\n", + "It helps to specify the desired profession, the goals of the analysis, what the images show and what kind of algorithm you want to use. Again, if you specify the task too vaguely, the LLM may generate generic code that does not fit your actual problem." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "9bceaaf4-bf7e-4639-9355-16ebaeefac7c", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "To achieve this task, we'll utilize a deep learning model, such as a U-Net, to perform semantic segmentation of the nuclei in the DAPI-stained image. Following segmentation, we'll extract quantitative metrics like intensity, size, and shape.\n", + "\n", + "Below is an example code to accomplish this task using Python and popular libraries such as TensorFlow/Keras for deep learning, OpenCV for image manipulation, and scikit-image for measurements.\n", + "\n", + "Ensure you have these packages installed in your environment:\n", + "- TensorFlow: `pip install tensorflow`\n", + "- OpenCV: `pip install opencv-python`\n", + "- scikit-image: `pip install scikit-image`\n", + "- NumPy: `pip install numpy`\n", + "\n", + "```python\n", + "import tensorflow as tf\n", + "import numpy as np\n", + "import cv2\n", + "from skimage import measure\n", + "import matplotlib.pyplot as plt\n", + "from tensorflow.keras.models import load_model\n", + "\n", + "# Load the pre-trained U-Net model\n", + "model_path = 'path_to_your_trained_model.h5'\n", + "model = load_model(model_path, compile=False)\n", + "\n", + "# Load and preprocess the input image\n", + "def preprocess_image(image_path):\n", + " image = cv2.imread(image_path, 
cv2.IMREAD_GRAYSCALE)\n", + " image = cv2.resize(image, (256, 256)) # Resize to the model's expected input size\n", + " image = image / 255.0 # Normalize the image\n", + " return np.expand_dims(image, axis=[0, -1]) # Add batch and channel dimensions\n", + "\n", + "# Predict and segment the nuclei\n", + "def segment_nuclei(image):\n", + " prediction = model.predict(image)\n", + " segmentation = (prediction[0, :, :, 0] > 0.5).astype(np.uint8) # Threshold prediction\n", + " return segmentation\n", + "\n", + "# Quantify nuclei properties\n", + "def quantify_nuclei_properties(segmentation, original_image):\n", + " labeled_nuclei = measure.label(segmentation)\n", + " properties = measure.regionprops(labeled_nuclei, intensity_image=original_image)\n", + " \n", + " nucleus_data = []\n", + " for prop in properties:\n", + " nucleus_info = {\n", + " 'centroid': prop.centroid,\n", + " 'area': prop.area,\n", + " 'mean_intensity': prop.mean_intensity,\n", + " 'eccentricity': prop.eccentricity,\n", + " 'solidity': prop.solidity\n", + " }\n", + " nucleus_data.append(nucleus_info)\n", + " return nucleus_data\n", + "\n", + "# Main analysis\n", + "def analyze_image(image_path):\n", + " original_image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE) # Load the original image for intensity measurements\n", + " image = preprocess_image(image_path)\n", + " segmentation = segment_nuclei(image)\n", + " \n", + " # Upscale segmentation to the original image size for accurate measurement\n", + " segmentation = cv2.resize(segmentation.astype(np.uint8), original_image.shape[::-1])\n", + " \n", + " nucleus_data = quantify_nuclei_properties(segmentation, original_image)\n", + " \n", + " # Visualize the results\n", + " plt.figure(figsize=(10, 5))\n", + " plt.subplot(1, 2, 1)\n", + " plt.imshow(original_image, cmap='gray')\n", + " plt.title('Original Image')\n", + " \n", + " plt.subplot(1, 2, 2)\n", + " plt.imshow(segmentation, cmap='gray')\n", + " plt.title('Nuclei Segmentation')\n", + " \n", + " 
plt.show()\n", + " \n", + " return nucleus_data\n", + "\n", + "# Example usage\n", + "image_path = 'path_to_your_dapi_image.png'\n", + "nuclei_properties = analyze_image(image_path)\n", + "print(nuclei_properties)\n", + "```\n", + "\n", + "### Explanation:\n", + "- **Preprocessing:** We load the image in grayscale and normalize its pixel values to between 0 and 1.\n", + "- **Deep Learning Prediction:** The image is passed through a U-Net model, which generates a binary mask (segmentation) of the nuclei.\n", + "- **Labeling and Properties Extraction:** The `skimage.measure.label` function is used to label connected components in the segmentation mask. We then use `regionprops` to calculate properties such as area, mean intensity, eccentricity, and solidity for each labeled nucleus.\n", + "- **Visualization and Results:** The original and segmented images are displayed, and a list of dictionaries containing the extracted properties for each nucleus is returned and printed.\n", + "\n", + "Make sure to replace the placeholders (`path_to_your_trained_model.h5`, `path_to_your_dapi_image.png`) with actual paths to your model and image files. You might need to adjust the image resizing steps depending on your model's expected input size." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "You are a bio-image analyst.\n", + "You are about to analyse an image taken with a \n", + "fluorescence microscope that shows nuclei in DAPI staining.\n", + "You should use Deep Learning to segment the nuclei in the image.\n", + "You should quantify intensity, size and shape of the segmented nuclei." + ] + }, + { + "cell_type": "markdown", + "id": "03b352b9-15ab-49be-8734-b5c196258b8f", + "metadata": {}, + "source": [ + "## Modifying code style\n", + "As shown earlier, you can ask the LLM to change the style of the code." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "7c6cd48b-9656-44da-9ea3-7f42d3269695", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "To segment and analyze nuclei in a DAPI-stained fluorescence microscope image using deep learning without defining functions explicitly, you can use a pre-trained deep learning model like U-Net for segmentation and then quantify intensity, size, and shape using libraries such as NumPy and OpenCV. Here's a concise Python script using PyTorch and OpenCV:\n", + "\n", + "```python\n", + "import torch\n", + "import numpy as np\n", + "import cv2\n", + "from torchvision import transforms\n", + "from PIL import Image\n", + "from skimage.measure import label, regionprops\n", + "\n", + "# Load image and pre-trained U-Net model\n", + "image_path = 'path_to_image.png'\n", + "model_path = 'path_to_model.pth'\n", + "image = Image.open(image_path).convert('L')\n", + "model = torch.load(model_path)\n", + "model.eval()\n", + "\n", + "# Preprocess the image\n", + "transform = transforms.Compose([\n", + " transforms.ToTensor(),\n", + " transforms.Resize((256, 256)), # Resize to match the model input if necessary\n", + " transforms.Normalize(mean=[0.5], std=[0.5])\n", + "])\n", + "\n", + "input_image = transform(image).unsqueeze(0)\n", + "\n", + "# Segment the nuclei\n", + "with torch.no_grad():\n", + " output = model(input_image)\n", + " segmented = output.squeeze().numpy() > 0.5 # Threshold the output\n", + "\n", + "# Convert segmentation to binary image\n", + "segmented_img = segmented.astype(np.uint8) * 255\n", + "\n", + "# Label and analyze properties of segmented nuclei\n", + "labeled_img = label(segmented_img)\n", + "props = regionprops(labeled_img, intensity_image=np.array(image))\n", + "\n", + "# Quantify attributes\n", + "for prop in props:\n", + " intensity = prop.mean_intensity\n", + " size = prop.area\n", + " eccentricity = prop.eccentricity\n", + "\n", + " print(f'Intensity: {intensity}, Size: 
{size}, Eccentricity: {eccentricity}')\n", + "```\n", + "\n", + "**Note**: This example assumes that you have a pre-trained U-Net model saved as a `.pth` file and an input image. You may need to adjust the transforms, model path, and image path appropriately. \n", + "\n", + "**Additional Considerations**:\n", + "- Ensure that the input dimensions match the model’s expected input dimensions (resize if necessary).\n", + "- If using a different framework (e.g., TensorFlow), adjust the loading and inference code accordingly.\n", + "- Pre-trained models for segmentation can often be found on platforms like PyTorch Hub or model zoos. These models should be compatible with your analytical goals." + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Make the code as short as possible and do not use functions." + ] + }, + { + "cell_type": "markdown", + "id": "6cabf074-4ccf-4caa-b718-deb0bd997ba6", + "metadata": {}, + "source": [ + "## Technical terms\n", + "If you know the basics of image analysis and deep learning, you can guide the LLM more precisely when it generates code. It needs this guidance because it cannot see the image and may not know which algorithm (e.g. [StarDist](https://github.com/stardist/stardist)) suits which kind of image. This may change, as LLMs are an active field of research." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "18a8a85f-8736-41c7-8e65-69508f05d259", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "Certainly! Below is a concise code snippet that uses the StarDist algorithm to segment nuclei in an image, then quantifies their intensity, size, and shape. 
This example assumes you have the necessary libraries installed and a pre-trained model available for use:\n", + "\n", + "```python\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from stardist.models import StarDist2D\n", + "from skimage import io, measure\n", + "\n", + "# Load the pre-trained StarDist model\n", + "model = StarDist2D.from_pretrained('2D_versatile_fluo')\n", + "\n", + "# Read the image with DAPI staining\n", + "image = io.imread('path/to/your/image.tif')\n", + "\n", + "# Segment nuclei using StarDist\n", + "labels, _ = model.predict_instances(image)\n", + "\n", + "# Quantify intensity, size, and shape\n", + "properties = measure.regionprops(labels, intensity_image=image)\n", + "\n", + "# Extract and print properties\n", + "for prop in properties:\n", + " print(f\"Nucleus {prop.label}:\")\n", + " print(f\" Intensity: {prop.mean_intensity}\")\n", + " print(f\" Size (area): {prop.area}\")\n", + " print(f\" Eccentricity (shape): {prop.eccentricity}\")\n", + "\n", + "# Optionally, display the segmented nuclei\n", + "plt.imshow(labels, cmap='jet', alpha=0.5)\n", + "plt.imshow(image, cmap='gray')\n", + "plt.axis('off')\n", + "plt.show()\n", + "```\n", + "This script does the following:\n", + "- Loads a pre-trained StarDist model for 2D fluorescence microscopy images.\n", + "- Reads the input image.\n", + "- Segments the nuclei using the StarDist model.\n", + "- Computes properties of each segmented nucleus, such as mean intensity, area (size), and eccentricity (shape), using `skimage.measure.regionprops`.\n", + "- Outputs these properties to the console.\n", + "- Optionally visualizes the segmentation overlaid on the original image.\n", + "\n", + "Ensure to replace `'path/to/your/image.tif'` with the actual path to your image file. Additionally, the script assumes you have installed the `stardist`, `numpy`, `matplotlib`, and `scikit-image` libraries. You can install them using `pip` if needed." 
+ ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "Use the StarDist algorithm for segmenting the nuclei." + ] + }, + { + "cell_type": "markdown", + "id": "245f1a50-6037-4859-b7b9-46f46732f2d3", + "metadata": {}, + "source": [ + "## Exercise: basic\n", + "\n", + "Load an image of your choice with nuclei staining ([example](https://github.com/haesleinhuepf/BioImageAnalysisNotebooks/blob/main/data/nuclei.tif)). Ask the LLM to use Otsu thresholding and connected-component labeling for segmenting the nuclei. Make the code work!\n", + "\n", + "## Exercise: advanced\n", + "Load an image of your choice showing cells with membrane staining ([example](https://github.com/haesleinhuepf/BioImageAnalysisNotebooks/blob/main/data/membranes_2d.tif)) and ask the LLM to generate code for using [CellPose](https://github.com/MouseLand/cellpose) for segmenting the cells. Follow the installation instructions." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bed19665-0251-41c8-8305-8675162aedbd", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "LLM Kernel (gpt-4o-2024-08-06)", + "language": "prompt", + "name": "llm-kernel" + }, + "language_info": { + "codemirror_mode": { + "name": "prompt" + }, + "mimetype": "text/x-prompt", + "name": "prompt" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/_config.yml b/docs/_config.yml new file mode 100644 index 0000000..8233497 --- /dev/null +++ b/docs/_config.yml @@ -0,0 +1,35 @@ +title: Prompt Engineering Basics +author: Robert Haase +logo: logo.png + +# Execute notebooks if they have a missing output +# could be "force" or "auto" or "off" +# See https://jupyterbook.org/content/execute.html +execute: + execute_notebooks: off + +# Define the name of the latex output file for PDF builds +latex: + latex_documents: + targetname: book.tex + +# Information about where the book exists on 
the web +repository: + url: https://github.com/ScaDS/generative-ai-notebooks + path_to_book: . # Optional path to your book, relative to the repository root + branch: main # Which branch of the repository should be used when creating links (optional) + +# Add GitHub buttons to your book +# See https://jupyterbook.org/customize/config.html#add-a-link-to-your-repository +html: + use_issues_button: true + use_repository_button: true + extra_footer: | +

+ Copyright: Licensed CC-BY 4.0 unless mentioned otherwise. + Contributions and feedback are welcome. +

+sphinx: + config: + html_show_copyright: false + html_last_updated_fmt: '%Y-%m-%d' diff --git a/docs/_toc.yml b/docs/_toc.yml new file mode 100644 index 0000000..7c442ba --- /dev/null +++ b/docs/_toc.yml @@ -0,0 +1,19 @@ +# Table of contents +# Learn more at https://jupyterbook.org/customize/toc.html + +format: jb-book +root: intro + +parts: + - caption: Exercises + chapters: + - file: 10_prompting.ipynb + - file: 20_use_cases.ipynb + - file: 30_prompt_engineering.ipynb + - file: 40_code_generation.ipynb + - file: 50_data_analysis.ipynb + - file: 60_bio-image-analysis.ipynb + + - caption: Links + chapters: + - file: imprint diff --git a/docs/imprint.md b/docs/imprint.md new file mode 100644 index 0000000..aef5744 --- /dev/null +++ b/docs/imprint.md @@ -0,0 +1,39 @@ + +# Imprint + +## Legal notice +### Information according to § 5 TMG + +Dr. rer. medic. Robert Haase\ + + +### Contact +Phone: +49 341 97 39309\ +Email: robert.haase (at) uni-leipzig.de + +### Responsible for the content according to § 55 (2) RStV: +Dr. rer. medic. Robert Haase\ +ScaDS.AI Dresden / Leipzig\ +Universität Leipzig\ +Humboldtstraße 25\ +04105 Leipzig + +# Disclaimer + +## Liability for content + +The content of our pages was created with the greatest care. However, we cannot guarantee that the content is correct, complete or up to date. As a service provider, we are responsible for our own content on these pages under the general laws in accordance with § 7 (1) TMG. According to §§ 8 to 10 TMG, however, we as a service provider are not obliged to monitor transmitted or stored third-party information or to investigate circumstances that indicate illegal activity. Obligations to remove or block the use of information under the general laws remain unaffected. However, liability in this respect is only possible from the point in time at which we become aware of a specific infringement. 
If we become aware of such infringements, we will remove the content concerned immediately. + +## Liability for links + +Our website contains links to external third-party websites over whose content we have no influence. We therefore cannot accept any liability for this third-party content. The respective provider or operator of the linked pages is always responsible for their content. The linked pages were checked for possible legal violations at the time of linking. No illegal content was apparent at the time of linking. However, permanent monitoring of the content of the linked pages is not reasonable without concrete indications of an infringement. If we become aware of infringements, we will remove such links immediately. + +## Copyright + +The content and works created by the site operators on these pages are subject to German copyright law. Reproduction, editing, distribution and any kind of use outside the limits of copyright law are permitted provided that the terms of the [Creative Commons CC-BY 4.0 license](https://creativecommons.org/licenses/by/4.0/) are observed. Insofar as the content on this site was not created by the operator, the copyrights of third parties are respected. In particular, third-party content is marked as such. Should you nevertheless become aware of a copyright infringement, please notify us accordingly. If we become aware of infringements, we will remove such content immediately. + +## Data protection + +Our website can generally be used without providing personal data. Insofar as personal data (for example name, address or email addresses) is collected on our pages, this is always done on a voluntary basis as far as possible. This data will not be passed on to third parties without your explicit consent. +We point out that data transmission over the Internet (e.g. when communicating by email) can have security gaps. Complete protection of data against access by third parties is not possible. 
+We hereby expressly object to the use of contact data published as part of the imprint obligation by third parties for sending unsolicited advertising and information material. The operators of the pages expressly reserve the right to take legal action in the event of unsolicited sending of advertising information, for example through spam emails. diff --git a/docs/intro.md b/docs/intro.md new file mode 100644 index 0000000..a477052 --- /dev/null +++ b/docs/intro.md @@ -0,0 +1,22 @@ +# Prompt Engineering Basics + +This page is a collection of Jupyter Notebooks about Prompt Engineering. It is aimed at people without programming skills. + +Contributions and feedback are very welcome! In case you see room for improvement, please [create a github issue](https://github.com/ScaDS/prompt-engineering-basics-2024/issues) and/or consider [contributing](https://github.com/ScaDS/Generative-AI-notebooks/blob/main/CONTRIBUTING.md). + +## How to use this resource + +For going through these materials, it is not necessary to install anything on your computer. You can follow along with the exercises using online services such as + +* [OpenAI's ChatGPT](https://chat.openai.com/) +* [Anthropic's Claude](https://claude.ai/) +* [Google's Gemini](https://gemini.google.com/app) +* [Perplexity AI](https://www.perplexity.ai/) +* [You.com](https://you.com/) +... and many others. Depending on which service provider you use, you may need to sign up and probably pay for a subscription. + +In some of the exercises, we will generate data analysis Python code. 
For executing this code, we recommend installing [mini-conda](https://docs.anaconda.com/miniconda/) and setting up an environment for Python programming and bio-image analysis as explained on [this page](https://haesleinhuepf.github.io/BioImageAnalysisNotebooks/01_introduction/readme.html). You can then also execute the shown code in Python Jupyter notebooks. + +## Acknowledgements + +We acknowledge the financial support by the Federal Ministry of Education and Research of Germany and by Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus in the programme Center of Excellence for AI-research „Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig“, project identification number: ScaDS.AI diff --git a/docs/logo.png b/docs/logo.png new file mode 100644 index 0000000..4f8a469 Binary files /dev/null and b/docs/logo.png differ diff --git a/readme.md b/readme.md new file mode 100644 index 0000000..ae89a60 --- /dev/null +++ b/readme.md @@ -0,0 +1,31 @@ +# Prompt Engineering Basics + +This repository contains a collection of Jupyter notebooks explaining prompt engineering. + +https://scads.github.io/prompt-engineering-basics-2024 + +It is maintained using [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) and built using [Jupyter Book](https://jupyterbook.org/intro.html). + +To edit this book, install dependencies like this: + +``` +pip install jupyterlab +pip install jupyter-book +pip install jupyterlab-spellchecker + +git clone https://github.com/scads/prompt-engineering-basics-2024 +cd prompt-engineering-basics-2024 +jupyter lab +``` + +For modifying the LLM-notebooks, make sure [ipy-llm-kernel](https://github.com/haesleinhuepf/ipy-llm-kernel?tab=readme-ov-file#installation) is properly installed. + +To build the book, you can run this from the same folder: +``` +cd docs +jupyter-book build . 
+``` + +## Acknowledgements + +We acknowledge the financial support by the Federal Ministry of Education and Research of Germany and by Sächsische Staatsministerium für Wissenschaft, Kultur und Tourismus in the programme Center of Excellence for AI-research „Center for Scalable Data Analytics and Artificial Intelligence Dresden/Leipzig“, project identification number: ScaDS.AI diff --git a/requirements.txt b/requirements.txt new file mode 100644 index 0000000..90f9931 --- /dev/null +++ b/requirements.txt @@ -0,0 +1,5 @@ +jupyter-book>=0.7.0b +sphinx-book-theme<1.0.0 +matplotlib +numpy +ipy-llm-kernel \ No newline at end of file