Sunday, June 16, 2024

Adventures in Computations: Part 1

I've decided to learn object-oriented programming again after a long hiatus from C++, MATLAB and whatever else came along after that.

I feel, in a sense, that the world is moving on, and that a prime skill for building useful programs is knowing a high-level language that can be a Swiss Army knife for different uses. In particular, I'm interested in tackling data science problems without having to spend inordinate amounts of time learning different coding platforms.

After some investigation, particularly into what's been going on in the last 2 years, I've settled on Julia to learn scientific machine learning. It's an open-source platform with the "functionality, ease of use and intuitive syntax of R, Python, SAS or Stata combined with the speed, capacity and performance of C, C++ or Java", as per the statement on julialang.org.

That sounds almost too good to be true, so I had to try this new tool.

My strategy will be to spend blocks of time every day, 15 minutes or maybe even up to an hour, coding and learning new tricks. I've decided to implement the ideas in MIT 18.S191/6.S083/22.S092 "Introduction to Computational Thinking". There are 3 versions now available; the course from Fall 2023 appears to have more interactivity and learning paths built into it (index — Interactive Computational Thinking — MIT), so I'm going to follow that to stay in touch with the latest and greatest on the subject. Thank you to MIT for making this resource open to the public.


Why is Julia fast?

Because it was designed to sit in the sweet spot of "fast" and "productive".




Benchmarks

Julia Micro-Benchmarks (julialang.org)


Julia Cheatsheet

The Fast Track to Julia (juliadocs.org)


A Bit about Julia Environment

  • It's a scripting language. 
  • It creates executable code from scripts without a separate, explicit compilation step.
  • Code is compiled just-in-time using the Low-Level Virtual Machine (LLVM) toolchain - see the short demonstration below.
  • It runs at speeds similar to other compiled languages, such as C/C++ and Fortran.
  • About 85% of Julia is written in Julia itself (the "base"); the remaining 15% (the "core") is written in C and compiled into a shared object library, or a DLL on Windows.
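
To see the compile-once, run-fast behaviour for yourself, time the same function twice in a fresh session. A minimal sketch (timings will of course vary by machine):

    # sum of squares over a vector; the first call triggers LLVM compilation
    # for the concrete argument type, later calls reuse the native code
    function sum_sq(v)
        s = zero(eltype(v))
        for x in v
            s += x^2
        end
        return s
    end

    v = rand(10^6)
    @time sum_sq(v)   # includes one-time JIT compilation overhead
    @time sum_sq(v)   # compiled code only - noticeably faster

You can also inspect the generated code directly with @code_llvm sum_sq(v) to see what the LLVM layer produces.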


Definitions

Data science is the study of the generalizable extraction of knowledge from data. It incorporates varying elements and builds on techniques and theories from many fields, including signal processing, mathematics, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization, uncertainty modeling, data warehousing, and high-performance computing with the goal of extracting meaning from data and creating data products.


Some videos 

The idea to eliminate the "two-language" problem:


Wednesday, April 28, 2021

Studying India's Covid-19 Pandemic Response : Part 2

Continued from the first post in this series...


A study of India's pandemic response won't be complete if we don't concentrate on the factors at play at the global level, which in turn affect everything down at the country level.

Let's start with addressing a fundamental issue at the heart of this crisis.

While it is not necessary to go into the statistics of the rise in cases and deaths around the world, one can easily see that the core mathematical pattern behind them is exponential in nature. So simple, perhaps, that most people gloss over it without understanding its implications.

A famous lecture on exponential growth by Professor Albert Allen Bartlett, an acclaimed professor of physics at the University of Colorado at Boulder, comes to mind. It is required watching for everyone, and I'm ready to die on that hill. I'll leave it in the supplementary section below.
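
Bartlett's central point fits in one line of arithmetic. A quantity growing at a steady rate of p percent per unit time doubles every

$$ T_{double} = \frac{\ln 2}{r} \approx \frac{70}{p} $$

(the "rule of 70", with r = p/100). Cases growing at a seemingly modest 10% per day double roughly every 7 days; four doublings in a month turns 1,000 cases into 16,000.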

That a burgeoning world population has outstripped the carrying capacity of the earth is something every kid is taught in school today. This comes with a loss in biodiversity, with CoV reservoir animals (bats) coming into increasing contact with human habitation. This paper takes a full look at the issue.

Within this context, it would seem that critical systems should be constructed and managed to rapidly adapt and scale up for exponential rises in illness. Why? Simply because these adverse exponentials seem to be occurring at an increased frequency. In other words, the extremes are becoming more frequent.

In March, Chatham House hosted a Global C-19 Vaccine Supply Chain & Manufacturing Summit, in which the main participants were CEPI (the Coalition for Epidemic Preparedness Innovations) and industry players. For a "summit" of this magnitude, it was telling that the WHO was largely absent from the discussions. Not that the WHO's leadership hasn't been questioned in this pandemic, on which this piece is a good (long) read.

An opinion piece in the BMJ noted the following after this summit, from which I quote : 

It is not widely known that current annual production of all vaccines in the world is about 5 billion doses. Yet this year the aim is to produce as much covid-19 vaccine as possible to meet projected demand—the organizers estimated about 9.5 billion doses which has never been done before. To date, production is less than 500 million doses so there is a very long way to go. 

Such a scale up of production will put a huge strain on the producers of the many inputs required to produce a vaccine and get it into the arms of millions of people. One participant in the summit said that their vaccine required 280 separate inputs. These range from the biological materials to grow the vaccine through a wide range of technical kit necessary for production to the vials that contain the finished product. On top of which vaccine production requires a range of highly-skilled technical personnel to manage what is, unlike most conventional medicines, a complex biological process—and such personnel are in short supply. 

Given the complexity of the task, and the myriad of different circumstances affecting the multiplicity of input suppliers, it is very difficult to anticipate exactly which critical supply problems will emerge or exactly how each of them might be dealt with. What is clear is that such problems will likely arise, given the unprecedented scale-up, and producers, regulators and governments need to be alert in addressing them.


Is it a surprise, then, that in April an Indian pharmaceutical giant at the heart of the biggest vaccine production effort in the world called out publicly for the USA to repeal its sudden embargo on raw materials and free up the supply chain? The USA seems to have done what any country at the apex of this crisis would have done: protecting critical vaccine raw materials became a national defense interest in an atmosphere of shortages. These issues simply do not occur in a vacuum; they are all interconnected.

In summary, we have a simple but large-scale lack of appreciation for the exponential.

With delayed responses, the timeframe within which manufacturing and logistics can scale up to meet an exponential crisis is woefully limited. Countries step in to protect their national interests. But a globalized world means you can't simply lock all doors on those who deal with you at many complex levels.

It would seem that an entirely new top-down management system, built to deal with the unique nature of these exponential events and the high number of moving parts, is the need of the hour.

Despite having all the fancy systems in place, the question of coordination between the different players in this vast system also needs addressing. What's the point of so-called "taskforces" if they're not used to maximum effect? Moreover, there could be players who are there to help, and players seemingly ordained to throw a wrench into the works.

We'll need to take a look into those aspects of this multi-dimensional issue. But that's for another post. 


SUPPLEMENT

Arithmetic, Population and Energy - a talk by Al Bartlett

Friday, April 23, 2021

Studying India's Covid-19 Pandemic Response : Part 1

India has emerged as the global epicenter of the covid-19 pandemic. Photo courtesy : Independent


Like hundreds of helpless expats, I sit in the Middle East taking stock of the seriousness of the pandemic situation in India. For the last 2 days, the number of daily cases has topped 300,000. Hospitals are overflowing, health services are unable to cope, oxygen supply is crunched, and crematoriums are cremating bodies in car parks. The positivity rate is at an all-time high of 30%. Vaccinations need to be at 10 million a day but are moving at a snail's pace of 3 million.

Unfortunately, one is faced with having to wade through 24/7 live-ticker news and a myriad of opinions and commentary on the situation in real time. Often, you come across phrases and terms used to describe the government's actions, such as "mismanagement", "complacency", "refusal to acknowledge shortcomings", "creaky system" or "rickety healthcare", and so on.

What is going on?

Management and systems engineering are topics that frankly interest me. Taking a step back and looking at this hot mess from a 10,000ft view, how can an average citizen understand the pandemic response policy, decision aid system and management strategy enacted by the current Modi administration specifically for this pandemic? 

Is it possible there is a resource somewhere that rewinds the tape back to day one and runs through all the actions of the government? Is there a non-partisan book or website anyone knows of? With India being home to a great business community and management institutions, I wonder why more people are not looking at this crisis from a high-level systems point of view.

Of course, these might be concerns even you have. So worry not. Let's try to pick through this disaster piece by piece and unravel the mess playing out before us. I would first like to share a few resources that act as "primers" for reading (I've also included an interview).


Primer :

1. What is the nature of the public health system currently established in India? This handy website explores the current health systems in place in several countries. What I like about it is how it goes in-depth into the organizational structures of each nation's health system. Not highly detailed, but just enough. One can look up India and do the necessary reading. https://www.commonwealthfund.org/international-health-policy-center/countries

2. "Combating the COVID-19 pandemic in a resource-constrained setting: insights from initial response in India" This is a neat analysis of all actions pursued by the Indian government in the first 4 months of the first wave of the pandemic in India. A SWOT analysis of those actions are also contained in the paper. I thought it was very systematically researched. https://gh.bmj.com/content/5/11/e003416

3. "A critique of the Indian government’s response to the COVID-19 pandemic". Self explanatory. An Indian economist exposes where the pandemic response has fallen short. Lots of points to take stock of. One needs to face these questions head-on. https://link.springer.com/article/10.1007/s40812-020-00170-x

4. "Modi Leadership Style Main Reason for India's Covid Mishandling" An interview with Indian economist and historian Ramachandra Guha suggests the principal blame and responsibility rests squarely on the Prime Minister’s shoulders. The analysis is on the leadership flaws in the highest man in power and the "yes men" built around him.  https://www.youtube.com/watch?v=AFVmsRmFE4Q

5. The EPC mess-up with medical oxygen. I leave two articles and the original bid that was floated here: a) https://scroll.in/article/992537/india-is-running-out-of-oxygen-covid-19-patients-are-dying-because-the-government-wasted-time and b) https://www.thenewsminute.com/article/how-kerala-managing-its-medical-oxygen-supply-147579. The Request for Bid from the Government: http://www.cmss.gov.in/sites/default/files/PSAPLANTTENDERDOCUMENT.pdf

6. "Ten scientific reasons in support of airborne transmission of SARS-CoV-2" https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(21)00869-2/fulltext

7. "Biological Risks in India: Perspectives and Analysis" This is a recent and noteworthy treatment of biological risks in India and stresses why India needs a firm 24/7 bio-disasters policy and purpose built institutions in place. https://carnegieendowment.org/2020/12/09/biological-risks-in-india-perspectives-and-analysis-pub-83399


Once we are done sifting through these articles and getting a handle on the problem, we can address sub-topics. That will be the subject of the next series of posts, which I hopefully will do soon. Thanks.

*  *  *

Sunday, February 7, 2021

Reverse Engineering Zwift Physics - A Fun Look

 INTRO 

Last year, I performed a simple exercise to reverse engineer a cycling ride done on the Fulgaz app, to understand the in-game variables being employed. I described it in this post. It was nice to receive a message from Fulgaz suggesting that I'd come very close to what they actually use for their sanctioned events.

Overall, the Fulgaz app seemed very in tune with the physics behavior we cyclists normally expect to encounter outdoors. Zwift, on the other hand, is tricky: none of the in-game physics constants and parameters are known to the public.

Given that many of us outdoor cyclists find the discrepancies between in-game Zwift physics and the real world confusing, it is apt to ask whether Zwift is even based on an "earth" model.

With that fun question - which world is Zwift in? - I make an attempt to understand what makes the app work the way it does. It's far from perfect, but hopefully it stimulates your own thinking so you can go off and try something similar.

Just don't send me any hate mail. My approach may be far from perfect.


MODELING TOOL

The modeling tool is the same one I used to model the Fulgaz performance. It takes more than 30 variables and allows breaking a course into many small segments for steady-speed analysis.

For Zwift, I included some adjustments so the model would accept different g constants, bike drag coefficients, altitude-based power de-rates depending on whether a cyclist is acclimatized or not, and segment-by-segment rolling resistances and drag areas depending on the cyclist's heading, the wind direction and the known characteristics of the road.

In short, there are a lot of "handles" I can pull in order to understand the whacky world of Zwift. A stripped-down sketch of the core calculation follows.
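
For readers who want to try something similar, the core of such a tool is small: for each segment, find the steady speed at which the power demanded by physics equals the power supplied by the rider. A minimal sketch in Julia (my spreadsheet has far more handles; all names and default values here are illustrative, not Zwift's):

    # steady-state power demand for one segment (simplified Martin-style model;
    # grade is treated as sin(θ) ≈ gradient for small slopes)
    function power_demand(v; cda=0.25, crr=0.004, m=71.6, rho=1.226,
                          grade=0.0, g=9.81, eff=0.976)
        (0.5*rho*cda*v^3 + crr*m*g*v + m*g*grade*v) / eff
    end

    # bisection on speed: demand is monotonic in v, so this always converges
    function steady_speed(p_watts; kwargs...)
        lo, hi = 0.1, 30.0                     # m/s search bracket
        for _ in 1:60
            mid = (lo + hi) / 2
            power_demand(mid; kwargs...) < p_watts ? (lo = mid) : (hi = mid)
        end
        (lo + hi) / 2
    end

    v = steady_speed(150.0; grade=0.04)        # 150 W on a 4% grade
    println("steady speed ≈ ", round(3.6v, digits=1), " km/h")

Everything in the method below is, at heart, repeated calls to something like this with different parameter guesses.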


METHOD

I rode one lap of Mighty Metropolitan using my real bike hooked up to a Computrainer. The in-game bike chosen was the 2021 Canyon Aeroad with a Zipp 202 wheelset. My weight and height were set to 64.8 kg and 173 cm respectively. The bike weight was assumed to be a race-ready 6.8 kg, a figure that came off some forums on the internet.

Fig 1 : A crazy twisty, windy course! Course details on Veloviewer

Power output was dual-recorded: the primary source was dual-sided power pedals, while the Racermate held the trainer load at a constant 150 W. This also yielded second-by-second transmission differences between the two sites of power measurement, depending on where on the course I changed gear or cadence. Cadence was freely chosen.

Using a script to process the GPX file, the course was broken into 60 segments in order to capture all the features of the terrain - flats, downhills and uphill sections. Sections running on glass rather than road surfaces were marked for rolling resistance adjustments.

Model variables were tweaked as far as practical to match the model segment times to the Zwift-recorded segment times and speeds, which came off the raw GPX file. The strategies were:

1) Minimize the difference between average model speed and Zwift reported average speed for the course. 

AND/OR

2) Minimize the difference between segment model speed and Zwift reported segment speed individually for uphills, flats and downhills. 
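
Loosely, in symbols: strategy 1 minimizes $\left| \bar{v}_{model} - \bar{v}_{Zwift} \right|$ over the whole course, while strategy 2 minimizes the per-segment mismatches $\left| v_i^{model} - v_i^{Zwift} \right|$ for each segment $i$.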

The exercise often seemed to be a tradeoff between the two. In principle, the two scenarios should be intertwined, but the way I derived segment-by-segment information from the GPX data could have introduced errors. For example, the average speed in a segment was derived from data that may have been acutely affected by outliers within that segment.

The best solution would balance the accuracy in average course speeds with the match within each of the segments. I ran a few scenarios to check the sensitivity of the model. 


ASSUMPTIONS

A CdA of 0.22 sq.m was modeled from the frontal area of Bassett et al. (Med Sci Sports Exerc 1999; 31:1665-1676) using my height and weight, and the coefficient of drag Cd was computed from Heil as:

Cd = 4.45 x mass (kg)^-0.45
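
As a sanity check on that 0.22 sq.m, plugging in my 64.8 kg and 173 cm (and assuming the commonly quoted Bassett et al. frontal area estimate $A_p = 0.0293\,H^{0.725}\,M^{0.425} + 0.0604$, with H in metres and M in kg):

$$ C_d = 4.45 \times 64.8^{-0.45} \approx 0.68, \qquad A_p \approx 0.0293 \times 1.73^{0.725} \times 64.8^{0.425} + 0.0604 \approx 0.32\ \mathrm{m^2} $$

$$ C_d A \approx 0.68 \times 0.32 \approx 0.22\ \mathrm{m^2} $$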

CdA factors were set to 90% on the flat and downhill sections assuming an aero position but 100% on the uphills. 

Is this CdA representative of the Zwift world? Probably. Earlier, I did some aero testing using the Chung "regression method" with the same bike and weight settings on another course called Queen's Highway. The resulting CdA was around 0.21 sq.m (Fig 2). The Crr values were bonkers, so I tried the Chung "virtual elevation method" instead and achieved better results (Fig 3).

In the VE method, the CdA came out more like 0.28 sq.m with Crr around 0.0032. To achieve that, I constrained Crr to 0.0032 and used Goal Seek to home in on the CdA value that made the virtual elevation match the expected elevation profile. Since CdA and Crr have an inverse relationship with each other, I can only find out how sensitive CdA is to a given Crr by changing the Crr value; for this exercise, I simply fixed Crr at 0.0032.
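
For the curious, the virtual elevation bookkeeping itself is only a few lines. A rough sketch of the Chung method as I understand it (variable names and default values are mine, not from the spreadsheet):

    # Chung "virtual elevation": given per-second power and speed plus guesses
    # for CdA and Crr, back out the slope the physics implies and integrate it.
    # If the guesses are right, the virtual profile overlays the real one.
    function virtual_elevation(power, speed; cda, crr, m=71.6, g=9.81,
                               rho=1.226, eff=0.976, dt=1.0)
        ve = zeros(length(speed))
        for i in 2:length(speed)
            v = speed[i]
            if v <= 0
                ve[i] = ve[i-1]                       # skip stopped samples
                continue
            end
            a = (speed[i] - speed[i-1]) / dt          # acceleration term
            slope = (eff*power[i]/v - 0.5*rho*cda*v^2 - crr*m*g - m*a) / (m*g)
            ve[i] = ve[i-1] + slope * v * dt          # distance × slope
        end
        return ve
    end

With CdA and Crr as the two knobs, you adjust until the virtual profile overlays the known elevation profile - which is essentially what Goal Seek was doing for me above.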

Fig 2 : (click to view) Results of aero testing done by the Chung method. Spreadsheet courtesy of Alex Simmons, Google Wattage Group


Fig 3 : (click to view) Results of the Chung virtual elevation method. Spreadsheet courtesy of Alex Simmons, Google Wattage Group.


From the above results, it appears my simulation tests using CdA between 0.22 and 0.29 sq.m and a Crr of 0.003 were reasonable. However, I'm pretty unsure of the modification factors used for the uphills, downhills and flats. Slope is hardly the reason for a rider to change position; in fact, it is really speed that sets body position. I've simply assumed speed to be low on the uphills (hence the higher CdA factor) and high on the downhills and flats.

The Crr of tarmac was chosen as 0.003, the same value I used for Fulgaz. A Crr of 0.003 is representative of a high-quality tire on a smooth road. Note that the Crr for glassy segments, set at 65% of tarmac, is arbitrary and hypothetical.

Other factors, like the acceleration due to gravity, relative humidity, altitude and air temperature in Mighty Metropolitan, were all based on data for NYC.


RESULTS

I've attached the results from the tabulation below. 

Fig 4 : (click to view) Tabulation of different case runs with the variables chosen to run the model. The first two runs are representative of earth while the rest are whackier attempts with whimsical variables.


Fig 5 : (click to view) Model performance compared to the "virtual" performance in Zwift for segmented distances of the course using the given variables in Case# 1. 


DISCUSSION

As you can see from Fig 5, the model did well to bring down the error in overall course time and speed, but it struggled to match the Zwift-recorded speed and time data for individual segments. In the best "earth" scenario, Case #1 (see Fig 4), the model flew down the descents but rode the uphills and flats slower. Trying to improve segment performance traded off against course-average performance, as shown in Case #2.

The segment-specific speed and time matching was not that great. This could stem from errors in the speed and time calculations within the segments themselves, which in turn stem from irregularities in the original GPX file. Moreover, since the GPX file is a continuous speed run of my avatar, with one segment's speed linked to the output of the previous segment (steep downhill speeds leading into a climb, for example), its transient nature will differ from the model's purely steady-state analysis.

In the other, whacky attempts (#3-#6), I employed a "non-earth" scenario by manipulating air density and the acceleration due to gravity. The best whacky attempt (#6) gave a near-identical error in course-average speed and time to Case #1, but with improvements in segment-specific matching. These results are "whacky" because a change in the acceleration due to gravity or air density should also affect the CdA values; in other words, they are all interdependent. I've ignored that obvious complexity and stopped further attempts here.


CONCLUSION

I tried to reverse engineer the physics variables and parameters set in the whacky world of Zwift. The model came close, but a closer match could be achieved only with whimsical input numbers. A purely steady-state cycling power model is perhaps not the best tool for matching rolling segment data; overall, however, the model course time matched the Zwift course time closely.

If you don't know what to make of this fun attempt, don't worry. I'm just as puzzled about how the world of Zwift works. And perhaps I just said it: the world of Zwift may not even be a world on earth. We would like to assume so, but it might just be that we're in a world parallel to earth, with similar city names and hypothetical physics. Perhaps those flying cars, and glassy climbs where riders can climb at 40 kph and descend at 80 kph, are proof enough.

As a cyclist who has spent over 15 years in tune with the world around him and how his bike rides in it, the way Zwift over-reports speeds on flats and downhills seems overly flattering. That said, I love my in-game CdA and rolling resistances and whatever other Zwifty physics constants there might be. The enjoyment of Zwifting far outweighs the eccentricities of this software.



*  *  *

Sunday, September 20, 2020

The Race of a Lifetime : Tadej POGAČAR's Stage 20 Time Trial Analysis

Photo courtesy : Steephill.tv


Great comebacks are always a fascination for sports observers, from both an entertainment and a statistics perspective. Don't we all live for that moment when fortunes are reversed and the underdog wins? In social psychology, the pleasure spectators take in the fall of the favorite even has a special name - schadenfreude.

Such a reversal of fortune happened in the 36.2 km individual time trial of Stage 20 at the Tour de France, when a 21 year old Tadej Pogačar clawed back nearly 2 minutes from his nearest rival Primoz Roglič, all but securing the coveted yellow jersey and taking home 500,000 Euros in hard-won prize money.

This was a rare feat to witness 20 days into the 3,500 km Tour de France, and many had made up their minds that 57 seconds was too large a chunk of time to win back from a highly motivated Primoz, who had been sitting in yellow for 11 days in a row. In the aftermath, the sport's pundits will be looking closely at how the youngster accomplished this, having beaten just about every veteran of the time trial format who contested that day.

Allow me to devote a brief section below to the analysis of the actual time trial performance and the corresponding power demands without going too much into the mathematics of it all. Please note this analysis remains to be validated since the official performance data from Team UAE Emirates is unavailable to the public as of today. Sources of my information are highlighted below and where required, educated guesses are employed. I also discuss my results towards the end of the article.


Assumptions & Considerations

I've used the following assumptions & considerations in this first order analysis :

  • Weight/Height : 66 kg/176 cm (Source)
  • Assumed Drag Area, CdA : T1/T2/T3/Finish = 0.22/0.24/0.3/0.3 sq.m (arbitrary but educated)
  • Assumed Rolling Resistance Coefficient, Crr : 0.002-0.0023, 25mm width (Vittoria Corsa tubeless)
  • Assumed drivetrain efficiency : 98%
  • Bike T1-T2 : TT bike w/ rim profile 60mm/Full Disc at 8.3 kg
  • Bike T2-Finish : Road bike w/ rim profile 30mm/30mm at 6.8 kg (current UCI limit, source)
  • Gear : Aerodynamic skin-suit and streamlined TT helmet
  • Weather : Historical weather for 3-5pm local French time w/ winds 8.5-12 kph at 93-105 degrees range.
  • Roads : Good (smooth asphalt) w/ mountainous terrain
  • Course GPX source : Ritchie Porte's Strava data  
  • Performance time data : Pro Cycling Stats 
  • Model used : A widely cited & validated general purpose model of human power requirements in cycling
  • Secondary power data for comparison : Thomas de Gendt's Strava data for Stage 20


Method

The race course was broken up into 4 segments corresponding to the official time checkpoints for the stage. A 1st-order physics model was used in combination with the official timings at those checkpoints to reverse-calculate a suitable matching power output. I say "suitable" as the numbers could change up or down depending on the actual conditions. From the potential locus of power outputs, this is a workable number for the rider, as I validate below.
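
For reference, the general form of such a model (Martin et al.'s being the widely cited one) gives power directly as a function of segment speed, so the reverse calculation amounts to adjusting the power until the segment time matches:

$$ P = \frac{v}{\eta}\left( \tfrac{1}{2}\,\rho\, C_d A\,(v+v_w)^2 + C_{rr}\, m g \cos\theta + m g \sin\theta \right) $$

with $v$ the ground speed, $v_w$ the effective headwind component, $\eta$ the 98% drivetrain efficiency assumed above, and $\theta$ the road gradient angle; acceleration terms largely wash out over a full segment ridden at quasi-steady effort.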

Stage 20 ITT course profile


Results

The modeling indicates that the first two sections, totaling 30.3 km, ridden on a special-purpose TT bike weighing an assumed 8.3 kg and with a body shape of CdA 0.22 sq.m, required an average power output of 427 Watts. The results indicate a positive split, with an average power of 451 Watts for the 1st segment to T1 and 402 Watts for the 2nd segment, T1-T2.

In the vicinity of T2 at 30.3 km, a bike change took place: the TT bike was exchanged for a lighter road bike, necessitated by the gradient. The climb averages 8%, kicking up to 20% in places. The bike change cost anywhere from 6-8 seconds in total, depending on how you start and stop the watch. This time cost is factored into the overall performance time.

Thus, over the last 5.9 km of this climb, the assumed 6.8 kg road bike required an approximate average of 412 Watts, an estimated 6.2 W/kg (power to rider weight). The power demands for T2-T3 (about 3.3 km) and T3-Finish (about 2.6 km) were 432 and 392 W (6.5 & 5.9 W/kg respectively).

The results are plotted in the image below :

(Click to zoom) : Actual performance times along with corresponding modeled average power outputs for Tadej Pogacar in the final individual time trial of Stage 20 of the 2020 Tour de France.



Discussion

This is an unverified analysis based on checkpoint timings obtained from Pro Cycling Stats and other publicly available information. An average power output of 419 Watts was required for this performance as per the modeling. What is definitely in question is the pacing profile over the duration of the effort, which needs to be validated against real data.

Such a power output is not totally unrealistic for Tadej: we know that on the 140 km mountain Stage 8 of the Tour, he displayed a power output of 428 Watts over the Col de Peyresourde, climbing it in one of the fastest times recorded in recent history at an estimated power-to-weight ratio of over 6.5 W/kg. This was after 2 massive climbs before it and 120 km in the legs.

The modeled power output of 412 W on the final 5.9 km climb equates to a power-to-weight ratio of 6.2 Watts/kg. Compare this to Thomas de Gendt's data from the same stage, where he rode at an average of 405 W and a power-to-weight of 5.9 Watts/kg. This is consistent with Thomas's performance data, which shows he climbed 1:51 minutes slower than Tadej.

The overall data indicates a positive split of power across elapsed time duration. I justify this with two potentially valid points : 

1. High motivation at the start, giving the rider the urge to ride hard in the first half. Tadej was chasing what looked like an improbable target, a 57 second deficit, to win the Tour de France. He might have purposely fired on all cylinders, banking time against the potential loss of valuable seconds later during the bike change and any other unforeseen events on the climb.

2. The decrease in power output in the second half might be attributed to a combination of accumulated fatigue and/or the change in power demand and sensations from a sudden switch to a lighter bike on a steep climb. The "sudden" change to a new bike, and the lack of objective power data from an absent head unit, meant that Tadej had to gauge his effort carefully. It could be that despite a drop in power and cadence, Tadej maintained the "same" or even "greater" level of perceived effort compared to the previous flat sections of the course. However, this is just my speculation.

The choice of tire rolling resistances and drag areas, although arbitrary, is not a totally wild guess. We know that Team UAE Emirates is sponsored by Vittoria in 2020, whose tubeless varieties have reportedly exhibited some of the lowest rolling resistances at race speeds. Therefore, I started with an ideal case of 0.002, increasing this to 0.0023 on the climb. I figured the weaving at slow speeds on the climb, combined with the quality of the road on the gradient, poses less-than-ideal conditions, justifying the small increase in Crr.

Reported coefficients of rolling resistance for some bicycle racing tires at race speeds. Source : Aerocoach


Professional TT riders are known to be slippery, exhibiting well under 0.25 sq.m of drag area in ideal conditions (smaller riders reportedly present less than 0.2 sq.m!). I started with an ideal scenario of 0.22 sq.m in the TT position, given Tadej's height and weight, increasing this to 0.3 sq.m on the climb, corresponding to a climbing position with the hands on the hoods. Again, these numbers are arbitrarily chosen, and there is no way at present to verify what the real numbers in open terrain might be. I do have some references from a Twitter conversation to suggest that my choices are conservative for a top professional rider.

CFD simulation results showing the individual contributions of wheels, bicycle and rider to CdA as well as the net CdA. Source : Fabio Malizia, Katholieke Universiteit, Leuven


The total system weight with rider and all accessories is an unknown. A premium TT bike setup of 8.3 kg and a lightweight road bike setup of 6.8 kg are not unexpected and match observations recorded on the internet. However, the weights of his kit, shoes, helmet, bottle etc. are unknowns. I have reason to believe these total under 1 kg; still, the uncertainty in the analysis of the final climb stems from the uncertainties in system weight and rolling resistance. Regardless, the modeled power outputs are likely not very far off the actual numbers.


Conclusion

I titled this the "race of a lifetime". Indeed, performances like these are hard to come by, simply due to the immense difficulty of turning around such a time deficit over a pile of fatigue and mental exhaustion 20 days into the Tour de France.

In some respects, Tadej's race performance has been likened to a pivotal moment in 1989, when the American Greg LeMond, bustling with energy and ready to try new technologies, beat the yellow jersey holder Laurent Fignon with the use of aerodynamic gear, in turn winning the Tour de France.

Whether Tadej's victory was, at the end of the day, a matter of such marginal gains is debatable. Yes, two purpose-made bikes were used in the time trial in an unusual manner, but this is becoming increasingly common in top races. Moreover, unlike in 1989, Primoz and Tadej were arguably evenly matched in technology, and in the funding and competent attention required to apply that technology. In fact, on race day, they both undertook bike changes before the 6 km climb, so any small variations in equipment really came down to supply differences between their equipment sponsors.

Did Tadej just ride his usual top race, as he does every time, while it was Primoz who slowed and fizzled out? Well, I think that is clear to see. A race is indeed won by the one who slows the least. And what prompted this spectacular fall when the day demanded the best? Whether it was the massive pressure upon Primoz's shoulders, the failure of his power pacing model, the fatigue, or ALL of the above, we will not know for sure.

What this performance says to me is that marginal gains did not win; something else contributed. Certainly Tadej rode the time trial of his life and converted the opportunity of a lifetime into a magnificent victory. In that moment, the individual qualities that make one rider better than another in the heat of the moment won out. It really is a victory for the human element.

Years after his crushing defeat in the 1989 Tour, Laurent Fignon would write that despite getting over it, "you never stop grieving over an event like that; the best you can manage is to contain the effect it has on your mind." I hope that Primoz, as amazing a rider as he has been to reach this level, is able to contain the effect of this race outcome on his mind and move on. He has more than a few good years of fight left in him at the very top. But an able and worthy opponent stands beside him to check that, in the form of Tadej Pogačar.

Thanks for reading. Comments and observations welcome below.

Sunday, August 30, 2020

Fulgaz App : Validating Model Prediction & Performance Results

INTRO 

After my self-inflicted Poor Man's Tour de France ended on 22nd August, I took a week of recovery and lunged into the Fulgaz app's 3-week fundraising campaign called the French Tour. This "curated" Tour campaign features most of the celebrated climbs of the Tour de France in the high Alps, along with other famous circuits in and around France. With a real-time leaderboard and 381 virtual kilometres with 14,000 m of climbing over 21 stages, it's a challenging event to keep my mind occupied while the actual Tour de France plays out.

When riding the Tour in Fulgaz, you see a beautiful HD or 4K video of the course, taken by a volunteer rider with a high-resolution camera. The volunteer obviously rode the course at their own capacity, so when you use the app, the speed of the footage can be set to "reactive": it effectively speeds up or slows down based on how closely you match the recording (for example 1x, 0.9x, 0.8x or >1x).

Being new to Fulgaz, I was quite impressed with the app's built-in features and sliders to "tune" nearly everything that has an appreciable impact on the ride - for example system weight, rolling resistance, drag area, even wind speed and direction. The app loads extremely fast, in about a second on my Windows 10 PC with 32 GB of RAM. What's more, you can download all the high-resolution videos to stave off any buffering troubles. A download takes about 15 minutes for a full-HD video on my modest internet connection.

All this was fascinating, given that a) I'm quite new to indoor cycling apps and b) Zwift, another leading indoor cycling app of which I'm a paying customer, keeps a lot of these variables under tight secrecy, so effectively you have little clue what is driving the model.


MODELING

Some time ago, I built myself a cycling performance model for personal use. It is based on Martin et al.'s power model for cycling, which I like to use for personal and coaching-related estimation purposes. I can build as many segments of a course as I like in the model and make tweaks to inspect how they change my performance. This is handy for climbing and TT predictions, and even drafting simulations.

For Stage 3 of the French Tour, registrants had to climb the 1,200 m vertical of the Col du Galibier. I was interested in how my performance results in Fulgaz would compare with the model predictions for the same power input. So I did the Galibier this morning, staying completely aerobic, sweating buckets, power tuned to steady perfection with a Computrainer erg controller and a second Powertap pedal (those curious how I used the Computrainer with Fulgaz can send me an email or comment). Once I had my performance, I fed the same powers into the model, along with the same driving variables I'd input into the app. Results are below, and a toy sketch of the per-segment calculation follows.
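
In spirit, the per-segment prediction is nothing more than the following (an illustrative sketch; the real model carries many more variables, and I've folded air density in at roughly its at-altitude value - all numbers and names here are hypothetical):

    # toy per-segment time prediction: for each (distance, gradient, power)
    # triple, find the steady speed by bisection and accumulate the time
    demand(v, s; cda=0.32, crr=0.003, m=72.0, rho=1.0, eff=0.98) =
        (0.5*rho*cda*v^3 + (crr + s)*m*9.81*v) / eff

    function segment_time(dist, s, p)          # dist in m, s as fraction, p in W
        lo, hi = 0.1, 30.0
        for _ in 1:60
            mid = (lo + hi) / 2
            demand(mid, s) < p ? (lo = mid) : (hi = mid)
        end
        dist / ((lo + hi) / 2)                 # seconds
    end

    # two hypothetical kilometre slices of a climb at the powers held there
    segments = [(1000.0, 0.07, 210.0), (1000.0, 0.05, 205.0)]
    total = sum(segment_time(d, s, p) for (d, s, p) in segments)
    println("predicted time ≈ ", round(total/60, digits=1), " min")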


RESULTS

Given the assumptions I used (stated in the graphic below), the app's performance results and the model predictions converged very well, which I'm pleased with. This gives me further trust in the app. Note how the positive and negative errors negate each other over time. Also note that the errors are generally within 5%, and the average error over the 17.9 km worth of segments I manually built is -0.26%.

(Click to view) "Virtual" performance in the Fulgaz app compared to model predictions for given power. The ride was done on a Computrainer using Fulgaz app. Strava results : https://www.strava.com/activities/3984801170

I believe the errors are partly due to :

1) The chosen granularity of the course, which is one kilometre at "constant elevation". In reality, the road might step up or down several times within a kilometre. For my purposes, though, this suffices.

2) I did not ride the km segments at "constant power". In fact, I modulated power based on how I felt.

3) I've assumed a constant rolling resistance of 0.003 for every segment. If the Fulgaz app changes rolling resistance in real time based on the segment you're on, that could affect the speed slightly.

Also note that I have not applied an altitude power-attenuation ("de-rate") in the model, because this is a virtual environment. However, we know for a fact, from tested runners and cyclists alike, that aerobic capacity drops at moderate to high altitudes, by how much depending on acclimatization levels and individual attributes. So in reality, actual times are very likely going to be slower. How much slower is another conversation; I hope to tackle that in an upcoming post.


CONCLUSION

The close agreement between my model and the actual performance on Fulgaz makes the Fulgaz app a reliable training tool, insofar as it is used for constant, steady-speed climbing (I have yet to test it for solo TT efforts against the wind). It also validates the Martin et al. model (which has probably been done several times before, by several people). When I shared this article with Mike Clucas of Fulgaz, he essentially confirmed that my reverse engineering closely matches the inputs that drive their model, at least in curated events such as the French Tour.

This post generally speaks to the need for indoor cycling training apps to make transparent to the customer what drives their in-game physics. If the in-game physics is not transparent, predictions based on widely used open-source models will be vastly different from in-app performance.

If the variables that impact in-game performance are not transparent, you can't effectively do the "what-if" predictions almost all cyclists do in real life ("if I ride with x equipment and/or shed a few pounds, how would that affect my performance?"). This does not help those who take their training and racing very seriously and like to pre-plan for an event.

One might argue that indoor cycling apps are built like "games", hence the physics can deviate to an extent simply because it is a game. But there can be impacts. Depending on the magnitude of the deviation, a host of things can be affected, ranging from perceived exertion, fatigue, and CP and W' dynamics, to, most importantly, the nutritional needs an app-based performance requires. Either way, if you can't predict something with physics, it's unverifiable, unpredictable and, might I add, possibly unstable.

If, on the other hand, all indoor cycling apps used a verifiable model, one could effectively standardize, minimize or account for a source of variability, while the rest of the differentiation could be in the graphics, software performance and other perks unique to each app.

I look forward to actually climbing the beautiful Galibier in reality, if I'm lucky enough to put together my coin collection and go to France. Huge thanks to everyone at Fulgaz for keeping me entertained during this tumultuous time.

-Ron

Sunday, August 23, 2020

The Poor Man's Tour de France : Virtual Stage Racing in GT Mimicry

Readers might recall that last year I attempted a Poor Man's Giro d'Italia, a tongue-in-cheek name for a stage racing simulation in which the objective was to follow the Giro while riding "short" stages pretty much every day by myself on local roads. The main motivation behind the exercise was to collect data and compare it to research studies on Grand Tours and Grand Tour racers.

I'd wanted to replicate something like that this year, but with some additional racing realism. Obviously, for this to happen, the intensities would have to be high and I'd have to race with other people. With the whole Covid-19 situation demolishing race calendars throughout the world, I turned to Zwift for the obvious solution.

And thereby, I began another self-inflicted stage racing attempt called Poor Man's Tour de France in July. 

I have a few points to make on this mini-adventure before I share the data :

1. The races began on 17th July and lasted up to 21st August. All race results are recorded in my Zwift power user profile. I started Zwift as a beginner in the E/D category and moved up to C by Stage 13. (To download my data in Excel .csv format, click on the plot in Figure 2 below, which links to the tabulated data.)

2. All races were done with a single-sided, pedal-based power meter and a heart rate monitor on a non-smart trainer.

3. The trainer used was the Feedback Sports Omnium Over-drive, a roller unit that is extremely portable, perhaps the most portable of all trainers. Owing to the direct contact between tire and roller, rolling friction and the dynamics of tire pressure matter a bit more than on direct-drive units. The unit has minimal inertia; therefore, there is little to no way to coast during racing. If you stop pedaling, you lose power and stop very quickly. On the plus side, riding this trainer has considerably improved my pedaling conditioning. The direct wheel-on-roller contact also gave me instant audible feedback on stomping versus smooth pedaling patterns.

4. Lacking a direct-drive setup, I was constrained by the inherent power curve of the above trainer. With the gearing available to me and that power curve, I rarely escalated past cruise powers of 200 Watts, for fear of damaging the rollers (I'd already damaged one earlier this year and lost nearly 3 weeks waiting for a warranty replacement to be shipped from Hong Kong!). This also limited my short maximal sprint power outputs to within 300 Watts (I was generally not interested in sprinting).

5. Daily distances were variable. Terrain was a mix of rolling-hill races, mountain stages, a few crits and uphill time trials. I skewed the race stages towards rolling hilly races.

6. Racing every day on Zwift while maneuvering around my time constraints as the parent of a 2-year-old wasn't easy. I therefore took a few more recovery days between stages than would be standard for a Tour. In general, I didn't exceed 3 days without a race, but the norm became racing every second day as fatigue started accumulating.

Figure 1 : The author's "pain cave", cobbled together during Covid-19 shelter in place restrictions in UAE. Materials used : book shelf, baby high chair, normal chair, ironing stand, yoga block, a weighing machine, Feedback Omnium Overdrive trainer, road bike, laptop and a 19 inch wide screen monitor. 


Below is the interactive data for all 21 stages of the Poor Man's Tour. Note that data is presented against a logarithmic y-axis to make the plot more readable. Scrolling over the data lines should show data points.


DATA RESULTS


Table 1 : List of races done during 21 day Poor Man's Tour de France on Zwift

Figure 2 : Data for 21 days from a self-inflicted stage racing simulation called The Poor Man's Tour de France (top to bottom) - total heartbeats, calculated calories, Zwift-reported calories, work done, elevation, Trimp points, average heart rate, bike stress, normalized power, average power, average cadence, distance, RPE, TSS/km, Trimp/km, & duration. Note that BikeStress is a training metric native to Golden Cheetah which establishes race intensities as a function of duration and intensity. Click on the line to view the data. 


DISCUSSION OF RESULTS

1. Total Distance, Elevation & Calories : Over the 21 stages, I completed 600 km of racing with a net ascent of 8,666 m, burning an estimated 12,000-13,000 kcal. This equates to 17% of the actual Tour de France distance, with an elevation gain nearly the height of Mt. Everest. These are modest numbers.

2. Heart rate : Racing heart rates ranged between 151-194 bpm across the 21 stages. The highest heart rates featured in stages with highly stochastic pacing efforts. For example, the two crits on Stage 9 and Stage 14, both on rolling terrain, showed the highest heart rates. However, there were other crits I attempted that did not feature high heart rates (for example Stage 18). Although the normalized power for the high heart rate stages was also high, that alone does not explain the higher heart rates. Perhaps cadence offers a clue, meaning the stages with higher cadence could feature higher heart rates. There may also be a hidden confounder outside this data (diet, sleep, fatigue, other activities in life...).

3. Power : In terms of normalized power, the range was 117 W-193 W over 21 days of racing. Interestingly, in the early stages I was just getting used to riding at high intensities on a trainer, and I was not wholly happy with the cooling air flow available to me, so the early stages featured low powers at high heart rates. As the stages evolved, I got fitter, able to deliver higher power to the pedals at similar heart rates. I also got myself a bigger industrial-size fan that could push out more air! There was a plateauing in power as racing progressed, which I attribute to day-to-day fatigue and the inherent power curve limitations of the non-smart trainer.

4. Aggregate stress : Over 21 days of riding, the aggregate TRIMP-based stress was 3008, for a daily stress of 143 AU/day. The aggregate BikeStress (a correlate of TSS) was around 2124, giving a daily figure of 101 AU/day. These were all calculated in Golden Cheetah; the standard recipes behind these metrics are sketched at the end of this list. Total work done was 12,036 kJ, an average of 573 kJ/day. On a per-day basis, these numbers are higher than the same data from the Poor Man's Giro d'Italia.

5. Distance-specific intensity : In terms of TSS/km and Trimp/km, two metrics that may be indicative of ride intensity per unit distance, the highest values came in stages featuring either a mountain climb time trial, a mountain race or a high-intensity crit. Of all the stages, Stage 3 (L'Etape du Tour Stage 3) and Stage 7 (Alpe du Zwift TT) posted the highest values of intensity per distance. This again agrees with my findings from the Poor Man's Giro d'Italia and the research data I posted there from the Sanders et al. investigation of Grand Tour racing. With the distance and per-day stress metrics stated above, one line of inquiry is whether the numbers differ between indoor and outdoor racing. Given the constraints on air flow and cooling indoors, one might expect higher race intensities indoors. Comparing the Zwift racing data with last year's Poor Man's Giro, the distance-specific intensity metric Trimp/km is definitely higher this year on Zwift. This is not exactly an apples-to-apples comparison, though, because I did not do a true "stage racing simulation" last year. Still, the argument that indoor intensities should be higher is a rational one and something to discuss and explore further.

6. RPE : Across 21 days of racing, RPE varied from a low of 6 to 10! The hardest I felt was during Stage 2 (L'Etape du Tour), which featured a mountain ascent of 1,538 m. Because this was one of the earlier stages, I was in no shape to climb continuously for 3 hours with poor air flow. In the final 20 minutes, I hopped off the bike once to take a break, thinking I was going to have a heart attack. Part of the challenge that day in my pain cave was the lack of air flow to cool myself for that long! The standard deviation in RPE across the stages was quite low, however, indicating that the intensity on all days was more or less similar.

7. Cadence : My average cadence across all stages was 83, and the highest was during Stage 19, a rolling-hills ITT of 28 kilometers, where I staved off fatigue by riding at 90+ cadence. I reckon the high-cadence stages were excellent stimulus for the VO2max region of training intensities. One thing I'm pleased about from this challenge is that I became quite experienced at regulating my cadence to tune my perceived effort in different racing situations. This factor may also have affected my heart rates over the course of each day's race.

8. Nutrition : In general, racing every day on Zwift means relying heavily on carbohydrates; success seems to depend on how well the stores of glycogen are topped up between races. My fuel system is biased towards carbohydrates, which may be partly explained by the fact that I'm a habitual carbohydrate consumer. There were some days where I didn't have the luxury of managing my diet well enough to feel fully topped up before the next race. On days I felt I needed an extra "boost", I used the top ergogenic aid known to man, that's right - Coca-Cola! Caffeine works.

9. Race Competition : In general, I have only good things to say about Zwift as it is a tremendous motivational tool. During the 21 stages, I enjoyed many days sitting in the peloton and sharing the effort that got us all across the line with good timings. But I was in no way a match for those who could utilize some of the "gaming" aspects of virtual racing.

I think Zwift has to figure out some way to weed out sandbaggers. Although the final listings on the Zwift Power website exclude those cheating below their actual categories, the race dynamics are still affected by the presence of these individuals. For example, it is often in the first 2-3 minutes of an e-race that your placement is made or broken, owing to massive power surges to find position. The presence of more able riders cheating below their category can compel others to ride just as hard in order to get on their wheel; as a result, many gaps form, disadvantaging the lower-order riders who have "missed the draft". This point may be moot.
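
As promised under point 4, here are the standard recipes behind the stress metrics, for anyone wanting to reproduce the numbers from raw ride files. A sketch of my understanding (Golden Cheetah's exact implementations may differ; the resting and max heart rates are placeholders):

    using Statistics

    # normalized power (Coggan): 30 s rolling mean, 4th-power mean, 4th root
    function normalized_power(power; window=30)
        rolled = [mean(@view power[max(1, i-window+1):i]) for i in eachindex(power)]
        mean(rolled .^ 4)^(1/4)
    end

    # TSS-style stress: 100 points = one hour ridden at FTP
    tss(np, dur_s, ftp) = (dur_s * np * (np / ftp)) / (ftp * 3600) * 100

    # Banister TRIMP (male coefficients); dur in minutes, heart rates in bpm
    function trimp(dur_min, hr_avg; hr_rest=50, hr_max=195)
        hrr = (hr_avg - hr_rest) / (hr_max - hr_rest)   # heart-rate reserve
        dur_min * hrr * 0.64 * exp(1.92 * hrr)
    end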

Figure 3 : The author in a "break" of select group of riders from the Namibian Race League
during Stage 17. 


CONCLUSION

The results discussed in the section above generally agree with the data found from Grand Tours: the mountain stages are where the action really is in terms of stress and intensity. The race intensities were high, and so was the day-to-day monotony, but Zwift provided a great way to beat that monotony with the ability to select from numerous races spread across different maps with different competitors. For example, I found South Africans and Japanese race subtly differently when compared to Brits! Maybe that is an imaginative observation, but it still is an observation.

Overall, while I found I was making improvements in duration-specific power outputs as the races progressed, I eventually hit a plateau due to a combination of fatigue and the power curve limitations of the trainer. In other words, there were diminishing returns after a point.

From the Poor Man's Tour de France racing challenge, I quickly learned which e-races suit me and which don't. The choice of many rolling hilly races was therefore intentional. I also included mountain stages. Flat, all-out races were few.

If I redid the Poor Man's Tour de France, I'd figure out a way to balance the percentage of race distance between mountains, flats and rollers. But I can't say the actual Tour de France, traditionally or even this year, has been balanced either! We often hear that Tour stages are deliberately designed to suit some of the top French stars. That doesn't seem to be any different this year.

In conclusion, given the constraints on my time, I think this was sufficient racing stimulus. With the plateauing of power and the accumulating fatigue as the stages progressed, I had to draw the line somewhere to minimize the losses.

With a few days left of the actual Tour de France, I will be able to smugly soak in the racing footage and maybe even pretend to relate it to my own experience, ha!