Home
AutoCite is a Python application that automatically creates citations for websites in either APA or Chicago format. With AutoCite, one can focus less on citations and more on actually writing a paper.
For a quick start, just head over to Autocite.zapto.org to try it out :)
There are four main ways to use AutoCite:
- AutoCite Command Line [AutoCite_CLI.py]
- AutoCite Graphical User Interface [AutoCite_GUI.pyw / AutoCite_Win_xxxxxx.exe / AutoCite_Linux_xxxxxx]
- AutoCite Private Server [AutoCite_Web]
- AutoCite on AWS [Autocite.zapto.org / AutoCite_lambda]
AutoCite Command Line is a Python application with just the bare essentials.
This is good if you want to use AutoCite as a component of another project or if you just really like command line applications.
USAGE: AutoCite_CLI.py URL FORMAT
Possible Formats:
    Chicago (default)
    apa
Notes:
    Ensure that URL begins with either "http://" or "https://"

> python AutoCite_CLI.py https://google.com apa
Citing https://google.com...
Citation format set as apa
Google (n.d). Retrieved from https://google.com
- Python 3.7+
- Beautiful Soup 4 (Python Module)
- Python Date Utility (Python Module)
pip3 install bs4 python-dateutil
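For the "component of another project" use case, one option is to call the command line script from Python. The following is a minimal sketch, not part of AutoCite itself: the helper names `build_cli_args` and `cite` are hypothetical, the script is assumed to sit in the current working directory, and the format names are passed exactly as they appear in the usage text above.

```python
import subprocess
import sys

# Format names copied from the usage text; "Chicago" is the documented default.
SUPPORTED_FORMATS = ("Chicago", "apa")


def build_cli_args(url, fmt="Chicago", script="AutoCite_CLI.py"):
    """Return the argument list for one AutoCite_CLI.py invocation."""
    # The usage notes require an explicit scheme on the URL.
    if not url.startswith(("http://", "https://")):
        raise ValueError('URL must begin with "http://" or "https://"')
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # sys.executable reuses the interpreter that is running this code.
    return [sys.executable, script, url, fmt]


def cite(url, fmt="Chicago", script="AutoCite_CLI.py"):
    """Run the CLI once and return everything it printed (hypothetical wrapper)."""
    result = subprocess.run(
        build_cli_args(url, fmt, script),
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode output as str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout
```

Calling `cite("https://google.com", "apa")` would then return the same text the CLI prints to the terminal, including the progress lines, so the caller would still need to pick out the citation line.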
AutoCite_GUI is the fastest and most direct way to create citations as the code is run directly on your own machine (unlike AutoCite on AWS). Simply download the executable for your operating system and run it!
Use this if you want quick citations and dislike having to keep visiting the online page.
To run the Python version of the GUI, install the same dependencies as for AutoCite Command Line.
AutoCite Private Server is a Flask web server that serves a webpage from which one can use AutoCite. This is useful for creating a home-, company-, or school-wide solution for citation needs.
Use this if you are a system administrator or in a similar role.
- Python 3.7+
- Beautiful Soup 4 (Python Module)
- Python Date Utility (Python Module)
- Flask (Python Module)
- Gunicorn3 (Linux Package)
pip3 install bs4 python-dateutil flask
sudo apt-get install gunicorn3
git clone https://github.com/BrandonTang89/AutoCite.git
cd AutoCite_Web
chmod +x run_deployment_server.sh
./run_deployment_server.sh
AutoCite on AWS is the primary way that most people will access AutoCite. Static site hosting on AWS Simple Storage Service (S3), with AWS CloudFront caching, is used to serve the webpage. AWS Lambda is used for the main citation work. AWS RDS is used as a cache for the citations to reduce latency and the number of AWS Lambda requests.
AutoCite on AWS can be broken down into two parts:
- Retrieving the static website HTML document of the site
- Creating the citations upon the "generate citations" button being clicked
When the user sends a GET request to autocite.zapto.org or autocite.info.tm, the request is redirected to "https://d2chtxlgatshjb.cloudfront.net/", the domain of the CloudFront distribution for the S3 bucket configured for static website hosting. From there, a cached version of the HTML document is served to the user.
CloudFront is used for two main reasons:
- Transport layer security (TLS) by securing the communication with an SSL certificate
- Reduced latency as the HTML document is cached at edge servers by AWS
When the "generate citations" button is clicked, the entire batch of citation requests is sent via a POST request to AWS API Gateway, which triggers an AWS Lambda function that connects to the database to check for cached citations.
The database is organised as follows:
database
|
--- autocite_cache (schema)
    |
    --- Records (table)
        |
        | --- hash
        | --- citation
Here, "hash" is the SHA-1 hash of a string formed by concatenating the raw URL and the citation format (APA or Chicago). Hashing the two together ensures that each unique combination of URL and format identifies its own citation. Furthermore, because the lookup query receives only the hash and never the raw user input, this helps to protect against SQL injection attacks.
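The key derivation can be sketched with Python's standard `hashlib` module. This is an illustration under assumptions rather than AutoCite's actual code: the function name `cache_key` is hypothetical, and the concatenation order (URL first, then format), the absence of a separator, and the UTF-8 encoding are all guesses.

```python
import hashlib


def cache_key(url, citation_format):
    """Return a SHA-1 hex digest identifying one (URL, format) pair.

    Assumed details: URL comes first, no separator, UTF-8 encoding.
    """
    combined = url + citation_format
    return hashlib.sha1(combined.encode("utf-8")).hexdigest()


apa_key = cache_key("https://google.com", "apa")
chicago_key = cache_key("https://google.com", "chicago")
# The same URL gives a different key for each format.
print(apa_key != chicago_key)
```

A hex digest is always 40 characters drawn from 0-9 and a-f, so whatever the user typed as a URL, the value that reaches the query contains no quotes or other SQL syntax.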
"citation" is the column that stores the citation for the URL and citation format used to form the hash. For a Chicago formatted citation, it is stored without the date accessed, because the citation may be retrieved on a different date from the one on which it was cached.
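A lookup that adds the date accessed only at retrieval time might look like the sketch below. Several things here are assumptions: an in-memory SQLite database stands in for the AWS RDS instance, the helper names are hypothetical, and the exact wording of the "Accessed ..." suffix is illustrative rather than taken from AutoCite.

```python
import sqlite3
from datetime import date


def open_demo_cache():
    """Create an in-memory stand-in for the Records table (not AWS RDS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Records (hash TEXT PRIMARY KEY, citation TEXT)")
    return conn


def fetch_citation(conn, key, citation_format, today=None):
    """Return the cached citation for key, or None on a cache miss."""
    # The key is bound as a parameter, never spliced into the SQL text.
    row = conn.execute(
        "SELECT citation FROM Records WHERE hash = ?", (key,)
    ).fetchone()
    if row is None:
        return None
    citation = row[0]
    if citation_format.lower() == "chicago":
        # The date is added now, so it reflects today rather than the
        # day on which the citation was cached.
        today = today or date.today()
        citation += f" Accessed {today:%B} {today.day}, {today.year}."
    return citation
```

Storing the undated text means one cached row can serve every later request, with only the final suffix changing from day to day.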
Citations that were not found in the cache are sent to another AWS Lambda function to be created asynchronously. A separate Lambda request is sent for each citation. This greatly speeds up the process, as the citations are generated in parallel.
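The one-request-per-citation pattern can be illustrated with a thread pool. This is only a sketch of the idea: `cite_one` is a hypothetical stand-in that returns a placeholder string instead of making a real Lambda request, and the worker count is an arbitrary choice.

```python
from concurrent.futures import ThreadPoolExecutor


def cite_one(url, citation_format):
    """Stand-in for a single Lambda request (hypothetical; no network call)."""
    return f"[{citation_format}] citation for {url}"


def cite_uncached(urls, citation_format, max_workers=8):
    """Issue one request per URL in parallel and map each URL to its result."""
    if not urls:
        return {}
    workers = min(max_workers, len(urls))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() runs the calls concurrently but yields results in input order.
        results = pool.map(lambda u: cite_one(u, citation_format), urls)
        return dict(zip(urls, results))
```

Because each request spends most of its time waiting on the network, the total time is then closer to that of the slowest single citation than to the sum of all of them.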
When all the citations have arrived, they are displayed to the user in the output box. In the background, a cache update request is sent. Sending a separate request for the cache update, as opposed to updating the cache inside the Lambda function that creates each new citation, improves speed because only one connection to the database is needed for the whole update. Furthermore, this ensures that the database is not overwhelmed with connections, which helps guard against denial of service.
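The single-connection update described above could be sketched as one batched write. As before, this is an assumption-laden illustration: SQLite stands in for AWS RDS, the function names are hypothetical, and `INSERT OR REPLACE` is SQLite syntax that would differ on another database engine.

```python
import sqlite3


def open_demo_cache():
    """Create an in-memory stand-in for the Records table (not AWS RDS)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Records (hash TEXT PRIMARY KEY, citation TEXT)")
    return conn


def update_cache(conn, new_records):
    """Write every new (hash, citation) pair over one connection."""
    # "with conn" wraps the batch in a single transaction: it commits on
    # success and rolls back if any row fails.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO Records (hash, citation) VALUES (?, ?)",
            new_records,
        )


conn = open_demo_cache()
update_cache(conn, [("key1", "First citation"), ("key2", "Second citation")])
print(conn.execute("SELECT COUNT(*) FROM Records").fetchone()[0])
```

However many citations were newly created, the database sees one connection and one transaction for the whole batch.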