[Endava] Fraud Detection API case study #53
Comments
@andreisucala
Thank you for coming back. Yes, we are very much interested in presenting. Is there a format we need to follow for the presentation? Are there any parts of the submission you want to discuss in more detail? Can you recommend some times, and who should the attendees be?
@andreisucala - Are you available to join the Standards WG this Thursday @ 08:00am PT / 04:00pm BST? @Henry-WattTime - Is there any specific part of the submission you'd like to discuss in more detail?
@seanmcilroy29 - Yes, sure. Feel free to send the invite! I'll forward it internally to the other members of the team. |
@seanmcilroy29 we updated the submission with the following:
As you are aware, COP28 is around the corner and we would like to announce our case-study publication in the GSF. Would you know if/when our case study could be published, so that we can run a campaign around it internally?
Fraud Detection API case study
Overview
We analyzed a fraud detection tool, deployed and maintained by our organization, which scans online transactions and assigns each one a risk score. The platform is used by multiple payment processors from all over the world.
Architecture for the system under consideration
Fig 1. Technology Deployment
Fig. 2 Infrastructure
Technical details of the components in the architecture
The application written in Java is deployed on EC2 instances (AWS). It uses the following tools deployed on the instances:
The application exposes an API (called EventAPI) that receives events and places them on a Kafka queue, a UI deployed on Nginx, and two other components that read and write events to and from the data stores.
The application infrastructure is deployed in 3 availability zones in eu-central-1, with one reader node in eu-west-1. The nodes are deployed across multiple tiers, each containing one node per AZ (3 nodes per tier).
MongoDB is deployed across regions, with half of the cluster deployed in eu-west-1 (read-only mode) for resiliency purposes.
In Production there are a total of 39 nodes, distributed across tiers as follows:
Table 1. Instances per tier
Note: The MongoDB deployment is duplicated in the secondary region.
Table 2. Number of instance types
Infrastructure summary
Resource usage
Fig. 3 Application Tier Memory Usage
Fig. 4 Application Tier CPU Usage
Fig. 5 CPU Usage Primary Region
Fig. 6 CPU Usage Secondary Region
Procedure
(What) Software boundary
The components included in the software boundary are:
For these components, we considered CPU and Memory to have the biggest impact, which is also supported by their corresponding cost. We also included storage in the SCI calculation to obtain a more comprehensive value.
We excluded the following factors based on the reasoning that their impact is less significant or too difficult to quantify:
(Scale) Functional unit
The functional unit chosen for considering the scaling of the application is 1 day, more specifically the API calls made over 1 day.
Over a 1-hour period there are 72,000 API calls, so over a 24-hour span there are 1,728,000 API calls.
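As a quick sanity check, the scaling arithmetic is:

```python
# Functional unit: API calls over one day, derived from the hourly rate
# stated above (72,000 calls per hour).
CALLS_PER_HOUR = 72_000
calls_per_day = CALLS_PER_HOUR * 24
print(calls_per_day)  # 1728000
```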
(How) Quantification method
Energy (E)
CPU & Memory
For Energy we used an API-based approach, choosing the Climatiq.io service to calculate values for each VM instance, based on CPU usage, instance type, location and duration of usage.
We added up the values from the memory_estimate and cpu_estimate response fields to get the total Energy used.
Request body
Response body
Based on the Activity Value for CPU and Memory taken from the response above, we have the following Energy values:
Total CPU & Memory energy consumption is:
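As a sketch, the per-instance summation of the cpu_estimate and memory_estimate fields could look like the following; the response shape and the numbers are hypothetical placeholders, not the actual Climatiq output:

```python
# Hypothetical per-instance responses; only the two fields summed in the
# calculation are shown, and the energy values are made-up placeholders.
responses = [
    {"cpu_estimate": {"energy": 1.2}, "memory_estimate": {"energy": 0.4}},
    {"cpu_estimate": {"energy": 0.9}, "memory_estimate": {"energy": 0.3}},
]

# Total energy E (kWh) is the sum of the CPU and memory estimates
# across all instances.
total_energy_kwh = sum(
    r["cpu_estimate"]["energy"] + r["memory_estimate"]["energy"]
    for r in responses
)
print(round(total_energy_kwh, 3))  # 2.8
```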
Storage
We used the same API-based approach for storage.
Request body
Response body
Storage energy consumption is:
Energy Carbon Intensity (I)
CPU, Memory & Storage
For Carbon Intensity, we took our data from Electricity Maps with the following parameters:
Germany was chosen as the location because our AWS Region is eu-central-1, which, as per this document (Regions and Zones), is located in Frankfurt. For the year, we took the latest available data provided by Electricity Maps.
When considering future directions and initiatives, it is worth noting that Electricity Maps offers forecasts for carbon intensity and power consumption. This would make it possible to forecast the carbon intensity of a given infrastructure across different locations in order to determine the most suitable data center in terms of emissions.
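A minimal sketch of fetching the latest carbon intensity for the DE zone could look like this; the v3 endpoint path, the auth-token header, and the carbonIntensity field follow Electricity Maps' public API documentation, but treat them as assumptions and verify against the current reference:

```python
import json
import urllib.request

# Assumed endpoint and header names; check the Electricity Maps API docs.
URL = "https://api.electricitymap.org/v3/carbon-intensity/latest?zone=DE"

def parse_intensity(payload: str) -> float:
    """Extract carbon intensity (gCO2eq/kWh) from a response body."""
    return float(json.loads(payload)["carbonIntensity"])

def latest_carbon_intensity(api_token: str) -> float:
    """Query the live endpoint; requires a valid API token."""
    req = urllib.request.Request(URL, headers={"auth-token": api_token})
    with urllib.request.urlopen(req) as resp:
        return parse_intensity(resp.read().decode())

# Offline check with a sample payload shaped like the documented response:
sample = '{"zone": "DE", "carbonIntensity": 473, "updatedAt": "2022-01-01T00:00:00Z"}'
print(parse_intensity(sample))  # 473.0
```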
Another important aspect to note is that the E and I values belong to different years. While Electricity Maps offers Carbon Intensity (I) data from as recently as 2022, the Climatiq API call that provides the Energy (E) data uses a calculation method named ar4, which is documented in CO2e - Methods of Calculation as referencing emissions from the year 2007.
With the disclaimers above, we consider the I variable as:
Embodied (M)
We calculated the embodied carbon of the software/hardware components of our cloud application by using a manual approach and the following equation provided by SCI Guidance:
M = TE * (TR/EL) * (RR/TR)
TE = Total Embodied Emissions, the sum of Life Cycle Assessment(LCA) emissions for all hardware components
TR = Time Reserved, the length of time the hardware is reserved for use by the software
EL = Expected Lifespan, the anticipated time that the equipment will be installed
RR = Resources Reserved, the number of resources reserved for use by the software
TR = Total Resources, the total number of resources available
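The equation translates directly into code; below is a minimal sketch with hypothetical inputs (note that the guidance reuses the abbreviation TR for both Time Reserved and Total Resources, so the parameter names disambiguate them):

```python
def embodied_carbon(te, time_reserved_h, expected_lifespan_h,
                    resources_reserved, total_resources):
    """M = TE * (TR/EL) * (RR/TR), in the same unit as TE (e.g. gCO2eq)."""
    return (te * (time_reserved_h / expected_lifespan_h)
            * (resources_reserved / total_resources))

# Hypothetical example: 1,000 kgCO2eq total embodied emissions, hardware
# reserved for 24 h of a 5-year lifespan, 4 vCPUs reserved out of 96.
m = embodied_carbon(te=1_000_000, time_reserved_h=24,
                    expected_lifespan_h=5 * 365 * 24,
                    resources_reserved=4, total_resources=96)
print(round(m, 2))  # gCO2eq attributable to this reservation
```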
CPU & Memory
Sources
TE = data from Cloud Carbon Footprint - Embodied Emissions constants
TR = number of hours used for Climatiq's /compute call
EL = we started with the 3-year warranty period provided by the hardware manufacturers (AMD for the t3a instances and Intel for all the rest) and increased it to 5 years on the recommendation of the GSF working group, based on what is assumed to be a more likely duration
RR = number of vCPUs for a given instance
TR = the largest instance vCPUs for the given instance family
Values
Storage
Sources
TE = data from a Cornell University paper on The Dirty Secret of SSDs: Embodied Carbon
TR = number of hours used for Climatiq's /storage call
EL = we used the same lifespan (5 years) as for the CPU & Memory calculation
RR = number of terabytes reserved for the EBS volume
TR = number of terabytes reserved for the EBS volume
For both RR and TR, we considered ourselves responsible for the entire EBS storage that was reserved, even though only 930 GB are in use.
Values
(Quantify) SCI Value Calculation
CPU & Memory
For the given formula
SCI = (E * I) + M per R
we have the following values:
E = 23.5 kWh
I = 473 gCO₂eq/kWh
M = 5641 gCO₂eq
R = 1,728,000 API calls in 1 day
This translates to 0.69 kgCO₂eq for a 1-hour period, at a 16% average CPU utilization.
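As an arithmetic check, plugging the stated values into the formula reproduces the hourly figure:

```python
# SCI inputs for CPU & Memory, as stated above (per day).
E = 23.5        # kWh
I = 473         # gCO2eq/kWh
M = 5641        # gCO2eq
R = 1_728_000   # API calls

daily_gco2e = E * I + M
print(daily_gco2e)                        # 16756.5 gCO2eq per day
print(round(daily_gco2e / R, 4))          # 0.0097 gCO2eq per API call
print(round(daily_gco2e / 24 / 1000, 2))  # 0.7 kgCO2eq per hour
```

The ~0.70 kg per hour agrees with the 0.69 kgCO₂eq figure above to within rounding.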
Storage
For the given formula
SCI = (E * I) + M per R
we have the following values:
E = 0.2615 kWh
I = 473 gCO₂eq/kWh
M = 701 gCO₂eq
R = 1,728,000 API calls in 1 day
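The same check for the storage component:

```python
# SCI inputs for storage, as stated above (per day).
E = 0.2615      # kWh
I = 473         # gCO2eq/kWh
M = 701         # gCO2eq
R = 1_728_000   # API calls

daily_gco2e = E * I + M
print(round(daily_gco2e, 1))      # 824.7 gCO2eq per day
print(round(daily_gco2e / R, 6))  # 0.000477 gCO2eq per API call
```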
Total
Conclusions
On infrastructure and scaling
It's worth noting that the current system is overprovisioned to ensure smooth running during usage spikes. We believe the system could handle double the current API call load without a significant increase in emissions; in other words, scaling the application will not cause a linear increase in carbon emissions.
In addition, the Embodied Emissions variable (M) represents the emissions from the creation and disposal of a hardware unit. These emissions do not normally increase over the total lifespan of a hardware device, which means that the longer the device is used, the lower the M variable, and consequently the SCI score, will be. In practice, the hardware units used in this infrastructure (AMD and Intel microprocessors such as the 2.5 GHz AMD EPYC 7000 Series processor) are likely to be used for longer than the 3-year warranty we started from. This can be seen by doubling the Expected Lifespan (EL) in the calculation of M from 5 years to 10 years, which lowers the value from 16.7 kg to 2.82 kg, a significant drop.
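All else being equal, the embodied-carbon equation M = TE * (TR/EL) * (RR/TR) implies that M scales inversely with EL, so doubling the lifespan halves the per-use share; a short sketch with hypothetical inputs:

```python
# Hypothetical inputs; only the expected lifespan (EL) differs
# between the two calculations.
TE, TR_HOURS, RR, TOTAL_R = 1_000_000, 24, 4, 96

m_5y = TE * (TR_HOURS / (5 * 365 * 24)) * (RR / TOTAL_R)
m_10y = TE * (TR_HOURS / (10 * 365 * 24)) * (RR / TOTAL_R)
print(m_5y / m_10y)  # 2.0 -- doubling EL halves the embodied share
```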
Alternative data sources
As a possible alternative method for obtaining the CPU & Memory M, the embodied emissions we received from Climatiq's /compute call in the embodied_cpu_estimate field are:
Adding up all the instance values would result in the following M for the total architecture:
M (alternative) = 3.44 kgCO₂eq
This value differs significantly from the one we obtained through manual calculation, which is M = 9403 gCO₂eq. We hope to investigate the reason for this noticeable difference in future PoCs.
Notes
The Cloud Carbon Footprint embodied-emissions constants come from a spreadsheet, which is in turn referenced in the Building an AWS EC2 Carbon Emissions Dataset article on Medium.
Climatiq returns embodied emissions in an embodied_cpu_estimate field, which implies that only CPU, and not Memory, is taken into account.
Some responses do not include an emission_factor field. It is documented in How Climatiq handles data quality that the emission factor is omitted when it is deemed to be unusable. However, we are not aware of the implications of this, and we do not know what, if any, other emission factor is used as a default.
Overall, while there is a breadth of data for some of the variables used in the SCI calculation, an alignment in terminology and clearer documentation would be immensely valuable in making use cases more approachable and increasing the degree of confidence in the resulting calculation.