Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert available funds / contributions/expenditures from area to step charts #216

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

hancush
Copy link
Member

@hancush hancush commented Oct 1, 2024

Overview

This PR updates the charts on candidate and committee pages. Namely it:

It also updates the aggregate queries to force a continuous interval of aggregate data. Previously, it was sparse, which posed a challenge for charting. See #216 (comment)

Demo

charts

Testing instructions

Update your local Docker Compose file to connect to a read-only version of the prod database (see Money Trail NM - Docker Env - Readonly Prod DB note in Bitwarden), then view some donation / expenditure charts and confirm they look good.

@hancush hancush temporarily deployed to openness-pro-hcg-revise-xr6h82 October 1, 2024 17:04 Inactive
@hancush hancush temporarily deployed to openness-pro-hcg-revise-xr6h82 October 1, 2024 17:08 Inactive
@hancush hancush requested a review from fgregg October 1, 2024 17:45
@fgregg
Copy link
Member

fgregg commented Oct 1, 2024

chart style

i think it's hacked into columns, because we were trying to evoke a bar chart, and not an area chart. If it was a true area chart, then the area under the curve should equal the total money. If you divided the monthly money by the number of days in the month, then you would have an approximation of that.

In distinction, with a barchart, you just pay attention to how high the whole number reaches. Why we didn't use a barchart, directly, I don't quite remember.

Even if we did want to use an area chart, i think it would still make sense to have something stair-step like this since we don't know when in a month the donation happened.

IMO, the best alternatives would be

  • A barchart
  • A stairstep area chart, with monthly amounts normalized by days in the month
  • A starstep line chart.

Of these, the line chart is probably the simplest to do.

Why data isn't loading?

I don't know. The aggregate_data.py file is responsible for making the view that power those. We could pair on that if you'd like.

@hancush
Copy link
Member Author

hancush commented Oct 1, 2024

@fgregg Thanks, got it. Updated to a stepped line chart.

step-chart

For the data issue, I resolved it by:

  • Recreating the materialized views (seems like we didn't do this after bringing in the last change to the aggregate queries in 27a86ba, which updated the contribution transaction types to capture)
  • Updating "Monetary contribution" to "Monetary Contribution" in the TransactionType table (the aggregate query looks for the specific capitalization we control, but we haven't reimported pre-2023 data since we made that change, so those transactions had the old description and weren't being captured)

This is ready for another look.

Copy link
Member

@fgregg fgregg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this code looks quite clean, nice job! some minor, optional suggestions.

where was this change made?

Updating "Monetary contribution" to "Monetary Contribution" in the TransactionType table (the aggregate query looks for the specific capitalization we control, but we haven't reimported pre-2023 data since we made that change, so those transactions had the old description and weren't being captured)

Also, since we are switching to line charts here, it might be good to switch to line charts for the cash on hand and debt at present charts above this chart?

) months
LEFT JOIN (
SELECT
{table}.amount AS amount,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about a COALESCE(table.amount, 0) as amount here, and then you can simplify some of the following python code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional suggestion, tbc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added up above!

months.month,
{table}.amount
FROM (
SELECT
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional suggestion: this looks like a good place to use a CTE.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CTEs aren't supported in the version of Postgres we're using for local development (which might not be a constraint in deployment, actually, but don't feel like upgrading Postgres here).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we should have the development env match up, so I can open an issue for that. fine not to change that here.

@hancush hancush changed the title Revise contribution/expenditure chart as a true area chart Revise contribution/expenditure chart as step chart Oct 2, 2024
@hancush
Copy link
Member Author

hancush commented Oct 2, 2024

charts

@fgregg Good idea to update the funds available, since these are point in time numbers, not an accumulating value. I revised that chart, and added axis labels and forced a consistent date range / x-axis for both, since the first chart should be a function of the second, i.e., they're connected and should be easy to compare?

Comment on lines -1039 to -1061
if summed_filings[0].opening_balance:
first_opening_balance = summed_filings[0].opening_balance
else:
first_opening_balance = 0

if summed_filings[0].debt_carried_forward:
first_debt = summed_filings[0].debt_carried_forward
else:
first_debt = 0

init_date = summed_filings[0].initial_date

first_initial_date = [int(since), 1, 1]

# If the first available filing date is on or before the start date
# passed into this method, use that as the first date in the trendline
if init_date:
init_date_parts = [init_date.year, init_date.month, init_date.day]
if datetime(*init_date_parts) <= datetime(*first_initial_date):
first_initial_date = init_date_parts

debt_trend.insert(0, [first_debt, *first_initial_date])
balance_trend.insert(0, [first_opening_balance, *first_initial_date])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure about doing this inference, since the other values are as-of the filing date, and we don't have the date for the filing before the first one in our database. i omitted it, but open to your thoughts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it's fine to omit.

@@ -83,10 +83,10 @@ <h4>
{% endblock %}

{% block charts %}
<h3><i class="fa fa-fw fa-area-chart"></i> Net funds over time</h3>
<h3><i class="fa fa-fw fa-area-chart"></i> Funds available and debts by filing date</h3>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Net is a difference in values. We're showing both values.

@hancush hancush requested a review from fgregg October 2, 2024 16:54
@hancush hancush changed the title Revise contribution/expenditure chart as step chart Convert available funds / contributions/expenditures from area to step charts Oct 2, 2024
Copy link
Member

@fgregg fgregg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job lining up the data ranges!

i think it's okay for the available fund chart to be a normal line chart and not a step chart. If you have $100 dollars on july 1, and $200 dollars on august 1, it's reasonable to think that you had $150 at some point in between. But, it's up to you.

FROM (
SELECT
DISTINCT DATE_PART('year', month) AS year,
GENERATE_SERIES(1, 12) AS month
FROM {table}_by_month
ORDER BY year, month
) months
LEFT JOIN (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait if we don't need to do LEFT JOIN here, then that suggests that every month we need is already in the generated table, in which case why do we need this subquery at all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we don't need that left join here, then we should probably move the coalese logic up to the aggregate functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would be awesome, because right now, the aggregate tables are sparse, so we can't distinguish between when there were $0 in transactions or we don't have data. That's why I did away with this left join - for the months on either end of the range, it reported $0 in transactions, but that might not actually be accurate.

@hancush
Copy link
Member Author

hancush commented Oct 2, 2024

Ok, @fgregg, last revision is in. Updated the funds chart to use end of filing period, adding the filing description/s to the tooltip, and fixed tooltip formatting on both charts.

charts

Comment on lines -274 to +299
for (i = 0; i < donation_trend.length; i++) {
donation_trend_f.push([Date.UTC(donation_trend[i][1],donation_trend[i][2]-1,donation_trend[i][3]), donation_trend[i][0]]);
}
const chartValues = Array.prototype.concat(balance_trend, debt_trend, expend_trend, donation_trend)

// format expenditures
for (i = 0; i < expend_trend.length; i++) {
expend_trend_f.push([Date.UTC(expend_trend[i][1],expend_trend[i][2]-1,expend_trend[i][3]), expend_trend[i][0]]);
}
const startYear = chartValues.reduce(function(prev, current) {
return (prev && prev.year > current.year) ? current : prev
}).year

const endYear = chartValues.reduce(function(prev, current) {
return (prev && prev.year > current.year) ? prev : current
}).year
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calculate start and end year across all chart data, to ensure consistency.

Comment on lines +972 to +989
period_end = {
"description": description,
"year": end_date.year,
"month": end_date.month,
"day": end_date.day,
}
balance_trend.append(
{
"amount": closing_balance,
**period_end,
}
)
debt_trend.append(
{
"amount": total_unpaid_debts * -1,
**period_end,
}
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now returning lists of dictionaries, in order to pass filing description through to the view. I don't see that trends is used anywhere else, besides the deprecated races view.

@hancush hancush requested a review from fgregg October 2, 2024 20:58
Copy link
Member

@fgregg fgregg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking good, but why did you turn them back into area charts?

@@ -114,7 +114,7 @@ ChartHelper.netfunds = function(el, title, sourceTxt, yaxisLabel, data, startYea
return new Highcharts.Chart({
chart: {
renderTo: el,
type: "line",
type: "area",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did you turn this back to area?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops!

Comment on lines +41 to +64
continuous_interval_query = """
CREATE MATERIALIZED VIEW {transaction_type}_by_{interval} AS (
WITH {interval}_summaries AS (
{summary_query}
)
SELECT
continuous_interval.entity_id,
continuous_interval.{interval},
COALESCE({interval}_summaries.amount, 0) AS amount
FROM (
SELECT DISTINCT
entity_id,
generate_series(
MIN({interval}) OVER (PARTITION BY entity_id),
MAX({interval}) OVER (PARTITION BY entity_id),
'1 {interval}'
) AS {interval}
FROM {interval}_summaries
) continuous_interval
LEFT JOIN {interval}_summaries
USING (entity_id, {interval})
ORDER BY entity_id, {interval}
)
"""
Copy link
Member Author

@hancush hancush Oct 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have transaction info for a candidate in Jan, March, and April, then we can infer from the absence of transactions that they clocked a total of 0 in Feb. This query creates a continuous sequence of aggregates for each entity, beginning with the first date we have a transaction for and ending with the last date we have a transaction for. Left joining the summaries then fills in intervals with no transactions with 0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is useful in particular for the chart. Given sparse data, a step chart will draw a line from point A to C, suggesting a potentially misleading value for B (which gets worse the more values are missing). This registers months without transactions appropriately as 0.

@hancush hancush marked this pull request as ready for review October 3, 2024 17:21
@hancush hancush requested a review from fgregg October 3, 2024 17:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants