Is Python good enough?
Published:
My two cents about Programming Languages
In the world of software engineering, picking the right programming language is super important. It affects how well your program runs, how reliable it is, and how much load it can handle. Some languages, like Assembly and C, are the building blocks for fast, critical software. Others, like Python and Ruby, are built on top of them and make coding easier and more expressive. It’s like going from playing with individual Lego blocks to creating intricate structures that can be combined to build even bigger structures.
Software is everywhere. For instance, Microsoft Windows is one of the most popular desktop operating systems, holding 72% of the desktop PC market share. Windows and its kernel are developed on top of C and Assembly. This gives them control over critical hardware by talking directly to components like memory and the CPU, which is necessary for software like an OS.
Check popular software and which programming languages were used to implement it
The graphic above shows the abstraction levels of coding in software engineering. It starts at the hardware level with the 1s and 0s of machine language, then Assembly, which higher-level languages ultimately rely on to talk directly to the hardware. This is a simplified representation of what’s behind a complete hardware and software system.
Why do high-level languages exist?
Low-level languages had some limitations in the early generations, which led people to build higher-level languages on top of them.
Lack of abstraction: A language like Assembly is very hard to grasp for a software engineer, unlike a higher-level language like Python, where you can start coding without digging deep into the fundamentals. The following two code snippets show the difference between printing “Hello World!” in Assembly and in Python. Professionals needed to use computational power without the burden of spending hours learning languages like Assembly, hence the rise of languages like JS, Python and R, which are very developer friendly.
Assembly Language
section .text ; declare the .text section
global _start ; has to be declared for the linker (ld)
_start: ; entry point for _start
mov edx, len ; load the message length (third argument of write)
mov ecx, msg ; load the address of the message (second argument of write)
mov ebx, 1 ; set the file descriptor (fd) to stdout
mov eax, 4 ; system call for "write"
int 0x80 ; call the kernel
mov eax, 1 ; system call for "exit"
int 0x80 ; call the kernel
section .data ; here you declare the data
msg db "Hello world!" ; the actual message to use
len equ $ -msg ; get the size of the message
Python
print("Hello World!")
Reduced developer productivity: You have to write far more code, and debugging requires a deep understanding of the fundamentals. Community support such as forums exists but is limited because there are few skilled developers. This slows development down, so the trade-off of having more control over the hardware is often not worth the time and skill a person has to put in.
Lack of skilled developers: It’s very hard to find people skilled enough to code at a low level. Only a few developers start their software engineering careers in low-level coding. Most institutes use Java or Python as the base programming language for their courses.
For these reasons, modern developers tend to move towards Python, JS and many other higher-level languages. The following graph shows programming language popularity based on the number of repos pushed to GitHub over the years. Almost all of the popular languages are higher-level languages, while Assembly ranks in the lower twenties.
GitHub language popularity based on repos, by https://innovationgraph.github.com/
Is Python good enough? 🤔
Let’s address the main question here: is Python good enough? Python has become the preferred language for many data scientists looking to begin their coding journey. One of its greatest strengths is its simplicity. Its keywords are intuitive, making it easy to start and get things moving. Tools like Jupyter and Google Colab notebooks enable non-software engineers to experiment with Python and swiftly put their ideas into action with minimal setup time.
Python has emerged as the dominant language for developing ML products, thanks to its community support and rich libraries. Data scientists have found Python to be more user friendly and flexible.
GPT-2, the latest OpenAI model to be open sourced, was written in Python. (Other languages like C++ could have been used in the actual training process.) Still, this suggests that even the best in the industry prefer to develop in Python. But this love for Python comes with a huge performance cap. The standard Python implementation (CPython) compiles code to bytecode and runs it on an interpreter instead of compiling it ahead of time to machine code, which hurts its performance. The following study shows how different programming languages perform on the same tasks.
A Comparative Study of Programming Languages in Rosetta Code
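To make the interpreter point a bit more concrete, here is a minimal sketch using the standard library's dis module. It assumes the standard CPython implementation, and the add function is just a made-up example. It shows that a Python function is stored as bytecode, which the interpreter then executes step by step.

```python
import dis


def add(a, b):
    """A tiny example function whose compiled form we want to inspect."""
    return a + b


# CPython has already compiled the function body to bytecode.
# Collect the names of the bytecode instructions it contains.
bytecode_ops = [instr.opname for instr in dis.get_instructions(add)]

if __name__ == "__main__":
    # Human-readable listing; the exact opcodes differ between Python versions.
    dis.dis(add)
    print(bytecode_ops)
```

The listing contains interpreter instructions, not CPU instructions, so every step still goes through the interpreter loop at run time.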
How to break the performance barrier in Python (GIL 🥲)
The GIL (Global Interpreter Lock) is a lock in CPython that lets only one thread execute Python bytecode at a time, which protects shared interpreter state from conflicts between threads. While the GIL offers benefits like simplicity and ease of implementation, it limits Python’s ability to achieve full multi-core performance by preventing true parallel execution of multiple threads. Because of the GIL, Python effectively runs its threads on one core at a time, even though modern CPUs contain multiple logical processors that could be used for computation.
A year ago, Guido van Rossum delved into Python’s future in a conversation with Lex Fridman on his podcast. He hinted at a potential future where the Global Interpreter Lock (GIL) might be eliminated from forthcoming versions of Python. But for now that’s not the case. The GIL is here to stay, so we need to find alternative methods.
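A rough way to see the GIL in action is to time a CPU-bound loop run twice in a row against the same loop run in two threads. The sketch below assumes a standard CPython build with the GIL enabled. The function names and the loop size are made up for illustration, and the timings will vary by machine.

```python
import threading
import time


def count_down(n):
    """Pure-Python, CPU-bound loop; the running thread needs the GIL for it."""
    while n > 0:
        n -= 1
    return n


def run_sequential(n, repeats):
    """Run the loop `repeats` times back to back; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, repeats):
    """Run the loop in `repeats` threads at once; return elapsed seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print(f"sequential: {run_sequential(n, 2):.2f}s")
    print(f"threaded:   {run_threaded(n, 2):.2f}s")
```

With the GIL, the threaded run usually takes about as long as the sequential one, because the two threads take turns instead of running in parallel.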
There are a few different paths to lessen the performance barrier in Python. The main method developers use is multiprocessing: split the work into tasks that take independent arguments and compute independently. This lets Python spawn multiple processes, each with its own interpreter and its own GIL, that can run on separate CPU cores, ideally giving up to × N performance (N: number of cores). See: Multiprocessing module in Python
The other method is to take the performance-critical functionality and write it in C or some other lower-level language that compiles to machine code and gives better resource management. Rust is a great language that sits in a sweet spot between high-level convenience and control over hardware. More on “How to implement a Rust Library in Python” will be discussed in another article, though there are many YouTube videos on this topic that you can check out. Until then, goodbye.
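Here is a minimal sketch of the multiprocessing approach using the standard library's multiprocessing.Pool. The prime-counting task and the helper names (count_primes, split_range) are made up for illustration. Any CPU-bound function that works on independent chunks would do.

```python
import multiprocessing as mp


def count_primes(bounds):
    """Count primes in the half-open range [start, stop). CPU-bound on purpose."""
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        # n is prime if no d in 2..sqrt(n) divides it.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(stop, parts):
    """Split [0, stop) into contiguous (start, stop) chunks (assumes stop > 0)."""
    step = -(-stop // parts)  # ceiling division
    return [(lo, min(lo + step, stop)) for lo in range(0, stop, step)]


if __name__ == "__main__":
    # One chunk per CPU core; each chunk is handled by a separate process.
    chunks = split_range(200_000, mp.cpu_count())
    with mp.Pool() as pool:
        total = sum(pool.map(count_primes, chunks))
    print(total)
```

The `if __name__ == "__main__":` guard matters here: on platforms where worker processes are started by re-importing the main module, it stops each worker from trying to create its own pool. The real speedup is usually below × N because of process startup and the cost of passing data between processes.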
Acronyms: