Calgarypuck Forums - The Unofficial Calgary Flames Fan Community - View Single Post - [GT] US election

Lanny_McDonald · 01-08-2021, 06:17 AM

Quote:

Originally Posted by Abatedmean

Nah, just personal stuff. I would love it if Trump's lawyers released his stats to compare it to mine. They obviously have much more data (and time) than I do.

Their work would be a lot more formal than mine as well. For the voter turnout I didn't bother to do a full analysis, I actually just took a linear approximation just to save time. I figured if the odds of it happening are .01% compared to .00001% it wouldn't make much of a practical difference.

A little late to this party but your claims intrigue me. I just have a few questions.

First, about your understanding of statistics and data analysis. What is your specific training in such analysis? To what level did you train in understanding statistics and data analytics? Where did you get your data and what tools did you use to conduct your data analysis?

Secondly, some of your claims are about irregularities in the data acquisition and processing procedures. Did YOU take into account the differences between states and the rules they put into play with regard to data collection and tabulation? The processes were very different from state to state, so no single methodology could be applied here to determine whether data was acquired in a consistent and reliable manner. Did you predict this, and how did it impact your model?

Finally, because of the variation in data collection and the release of data, how did you determine your model to be superior and more accurate than those of the actual bodies who had access to the ballots and had the ability to validate said data? Based on recount data did the final vote tallies remain consistent in your model, or did things change?

The thing that jumps out at me is your reference to the variance in the numbers in Pennsylvania after 3:42 AM, as if this is your smoking gun for voter fraud. I want to be sure that YOU understand the process of vote tabulation and why the numbers would change after that time, which I have a feeling you likely don't.

Each state had different rules for handling and tabulating votes. For example, in Arizona early voting was allowed and mail-in ballots were encouraged. There was great variance in how you could handle a ballot there, even allowing you to drop your mail-in ballot off at a polling station on the day of the election. Mail-in ballots were processed and counted as soon as they came in, so the data was available much earlier. Pennsylvania, by contrast, was handled very differently. They had very different rules for mail-in ballots, and those ballots were held back for counting until the evening of the election. So while Arizona's systems were set up to have those mail-in ballots (the majority of the ballots) counted prior to the polls closing, Pennsylvania could not even begin the process of ballot validation until the polls had officially closed. So the process greatly impacted the flow of data into the system.

Now, to the issue of the variance in data after 3:42 AM. I thought this was pretty easy to follow and was very predictable. Data collection and validation from small populations is easier and quicker than from large populations. The rural vote, where a precinct may handle a couple hundred to a couple thousand votes, is reported quicker than the urban vote, where tens of thousands of people may use the same polling place. Urban centers take longer to count ballots because of the number of ballots coming in. This is consistent across all elections, and is why the reported totals can show massive spikes as tranches of votes are reported from urban precincts. For Pennsylvania the biggest factor in these spikes was the mail-in vote, which could not begin to be counted until the ballots were opened and the signatures validated. This is a painstaking process and takes time, so it was predicted we would not see data from the mail-in ballots until the next day, which is exactly what the data displayed.
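
To make the reporting-order point concrete, here is a rough sketch in Python. Every number in it is made up for illustration (none of it is real election data), and the batch labels and function names are my own hypothetical choices. It only shows that when small rural batches report first and large urban and mail-in batches report last, the running margin can swing sharply late in the count even though the final total does not depend on the order at all.

```python
from itertools import accumulate

# Hypothetical batches in the order they are reported: (label, votes_a, votes_b).
# All figures are invented for illustration; they are NOT real election data.
BATCHES = [
    ("rural precincts", 60_000, 40_000),
    ("suburban precincts", 50_000, 50_000),
    ("urban precincts", 70_000, 130_000),
    ("mail-in ballots", 60_000, 140_000),
]


def running_margins(batches):
    """Return candidate A's running lead (A minus B) after each reported batch."""
    per_batch = [a - b for _, a, b in batches]
    return list(accumulate(per_batch))


def final_margin(batches):
    """Return candidate A's final lead, which is independent of reporting order."""
    return sum(a - b for _, a, b in batches)


if __name__ == "__main__":
    for (label, _, _), margin in zip(BATCHES, running_margins(BATCHES)):
        print(f"after {label:<20} A minus B = {margin:+,}")
    print(f"final margin (any order): {final_margin(BATCHES):+,}")
```

With these invented inputs, A leads after the first two batches and then trails once the large late batches land. Reversing the order of BATCHES changes the path of the running margin but not the final number, which is why a late swing by itself says nothing about the validity of the count.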

So as a "data scientist" did you take into consideration the variance in data acquisition and understand how that would impact the flow of data to your model? Did you bother to understand the rules in play and how specific processes and behaviors from state-to-state would completely change how a model would perform? Most importantly, did you go back to the recounts and see if the outcomes were consistent?

I look forward to your explanations of your model and the variations you saw, and to hearing whether you did any normalization to handle these variations or even took them into account as you built your model.