Sunday, June 16, 2024

Adventures in Computations: Part 1

I've decided to learn object-oriented programming again after a long hiatus from C++, MATLAB and whatever else came along after that.

I feel, in a sense, that the world is moving on, and that a prime skill for implementing useful programs is knowing a high-level language that can serve as a Swiss Army knife for different uses. In particular, I'm interested in tackling data science problems without having to spend inordinate amounts of time learning different coding platforms.

After some investigation, particularly into what's been going on in the last two years, I've settled on Julia to learn scientific machine learning. It's an open-source language with the "functionality, ease of use and intuitive syntax of R, Python, SAS or Stata combined with the speed, capacity and performance of C, C++ or Java" as per the statement on julialang.org.

That sounds almost too good to be true, so I had to learn this new tool.

My strategy is to spend blocks of time every day, 15 minutes or maybe even up to an hour, coding and learning new tricks. I've decided to implement the ideas in MIT 18.S191/6.S083/22.S092 "Introduction to Computational Thinking". There are 3 versions now available. The course from Fall 2023 (Interactive Computational Thinking, MIT) appears to have more interactivity and learning paths built into it, so I'm going to follow that one to stay in touch with the latest and greatest on the subject. Thank you to MIT for making this resource open to the public.


Why is Julia fast?

Because it was designed to be in the sweet spot of "fast" and "productive".




Benchmarks

Julia Micro-Benchmarks (julialang.org)


Julia Cheatsheet

The Fast Track to Julia (juliadocs.org)


A Bit about Julia Environment

  • It's a scripting language.
  • It creates executable code from scripts without a separate compilation step.
  • Code is compiled just-in-time using LLVM (the name originally stood for "Low Level Virtual Machine").
  • It runs at speeds similar to other compiled languages, such as C/C++ and Fortran.
  • About 85% of it is written in Julia itself (called the base); the remaining 15% (termed the core) is written in C and compiled into a shared object library, or a DLL on Windows.
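
To see the "no separate compilation step" point in practice, here is a minimal sketch of my own (not taken from the course). The function name sum_of_squares is made up for illustration. The first call to a function for a given argument type includes the just-in-time compile time, and later calls reuse the compiled code, so the first timing is usually the larger one. The exact numbers will vary by machine and Julia version.

```julia
# Hypothetical helper for illustration only; not part of any course material.
function sum_of_squares(n::Int)
    total = 0
    for i in 1:n
        total += i^2
    end
    return total
end

# The first call compiles the method for an Int argument, so the measured
# time includes compilation. The second call reuses the compiled code.
first_call = @elapsed sum_of_squares(1_000)
second_call = @elapsed sum_of_squares(1_000)

println("first call (includes compilation): ", first_call, " s")
println("second call (already compiled):    ", second_call, " s")
println("result: ", sum_of_squares(1_000))
```

Save this as a .jl file and run it with the julia command, or paste it into the REPL. There is no build step in between, which is what makes it feel like a scripting language.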


Definitions

Data science is the study of the generalizable extraction of knowledge from data. It incorporates varying elements and builds on techniques and theories from many fields, including signal processing, mathematics, probability models, machine learning, statistical learning, computer programming, data engineering, pattern recognition and learning, visualization, uncertainty modeling, data warehousing, and high-performance computing with the goal of extracting meaning from data and creating data products.


Some videos 

The idea to eliminate the "two-language" problem:

