Python: Libraries you should use — Part-1

Gateway to Efficiency

Pravash
4 min read · May 15, 2024

Hi, welcome to this exploration of Python’s vast library ecosystem! In this segment, I will share some of the coolest Python libraries that have become essential in my coding journey and that you might not know about yet.

Let’s dive in —

1. result

The result library introduces a parallel-path (often called “railway-oriented”) style of programming. Errors are handled through an abstract data type, commonly known as a monad or monadic structure, rather than raised as exceptions.

This offers an alternative approach to error handling. Many people prefer it over Python’s exception handling, since exceptions disrupt the main program’s control flow.

The approach draws inspiration from functional programming. Instead of returning a plain value or raising an error, a function returns a result value that is either Ok(value) or Err(error). This design makes error handling explicit within the program’s structure.

Example —

Using a try/except block:

def divide(a: int, b: int) -> int | str:
    try:
        if b == 0:
            raise ValueError("Cannot divide by zero")
        return a // b
    except ValueError as e:
        return str(e)

values = [(10, 0), (10, 5)]
for a, b in values:
    result = divide(a, b)
    if isinstance(result, int):
        print(f"{a} // {b} == {result}")
    else:
        print(result)

Using result:

from result import Result, Ok, Err

def divide(a: int, b: int) -> Result[int, str]:
    if b == 0:
        return Err("Can't divide by zero")
    return Ok(a // b)

values = [(10, 0), (10, 5)]
for a, b in values:
    match divide(a, b):
        case Ok(value):
            print(f"{a} // {b} == {value}")
        case Err(e):
            print(e)
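
Beyond pattern matching, a Result value also exposes helper methods for chaining. Here is a minimal sketch, assuming the rustedpy result package, which provides map() and unwrap_or():

from result import Result, Ok, Err

def divide(a: int, b: int) -> Result[int, str]:
    if b == 0:
        return Err("Can't divide by zero")
    return Ok(a // b)

# map() transforms the Ok value and leaves Err untouched;
# unwrap_or() extracts the value or falls back to a default.
print(divide(10, 5).map(lambda v: v * 2).unwrap_or(0))  # 4
print(divide(10, 0).map(lambda v: v * 2).unwrap_or(0))  # 0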

2. loguru

This library is for logging, but in a simplified way. You don’t have to create a logger object; a ready-to-use logger is provided, and by default it adds a bunch of useful functionality that solves the caveats of the standard logging module. That makes it as easy to use as a simple print statement.

Example —

from loguru import logger
import sys

# A ready-to-use logger, no setup required
logger.debug("Simple logging!")

## If you need to output the log to a file
logger.add("s_file.log")
logger.debug("That's it, simple logging!")

## Multi-process safety
logger.add("somefile.log", enqueue=True)
logger.debug("That's it, simple logging!")
logger.debug("Just simple logging!")

## Handler/Formatter/Filter
logger.add(sys.stderr, format="{time} {level} {message}", filter="my_module", level="INFO")

## Backtrace support
logger.add("out.log", backtrace=True, diagnose=True)

def check_div(c):
    try:
        val = 10 / c
    except ZeroDivisionError:
        logger.exception("What?!")

check_div(0)
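
loguru can also manage log files for you. A minimal sketch, assuming the rotation, retention, and compression options of logger.add() (the values here are illustrative):

from loguru import logger

# Rotate when the file reaches 10 MB, keep rotated logs for 7 days,
# and compress old files into .zip archives.
logger.add("app.log", rotation="10 MB", retention="7 days", compression="zip")

for i in range(3):
    logger.info("Message {}", i)  # brace-style formatting with positional args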

3. pendulum

Pendulum is one of the cooler utilities for dealing with dates and times, including date and time arithmetic.
Dates and times become really difficult to handle once different timezones, daylight saving, leap years, etc. are involved. With Pendulum it is really easy to manipulate dates.

Example —

import pendulum

now = pendulum.now("Europe/Paris")

# Changing timezone
now.in_timezone("America/Toronto")

# Default support for common datetime formats
now.to_iso8601_string()

# Shifting
now.add(days=2)

# Localization
dt = pendulum.datetime(1975, 5, 21)
dt.format('dddd DD MMMM YYYY', locale='de')
# 'Mittwoch 21 Mai 1975'
dt.format('dddd DD MMMM YYYY')
# 'Wednesday 21 May 1975'

pendulum.set_locale('en')
print(pendulum.now().add(years=1).diff_for_humans())
# o/p = in 1 year

As you can see, I created a date in a European timezone and then changed it to a different timezone very easily. You can shift a date simply by adding days to it.

Another cool part is localization: it produces human-readable dates and times in the locale you choose.
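
Pendulum also makes computing differences between dates straightforward. A small sketch, assuming the diff() and duration() helpers available in recent Pendulum releases:

import pendulum

start = pendulum.datetime(2024, 1, 1, tz="UTC")
end = pendulum.datetime(2024, 5, 15, tz="UTC")

# diff() returns an interval you can query in whole units
print(end.diff(start).in_days())              # 135

# Durations can be built directly and rendered as text
print(pendulum.duration(days=15).in_words())  # e.g. '2 weeks 1 day'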

4. xarray

xarray provides data structures for working with labelled arrays and datasets. It is great for scientific computing and data analysis, and it works in conjunction with pandas and NumPy.
If you are dealing with complex multi-dimensional data, xarray is a good choice.

Example —

import xarray as xr
import pandas as pd

# Create a sample xarray Dataset
data = {
    'temperature': (('time', 'location'), [[25, 30], [28, 32], [26, 31]]),
    'humidity': (('time', 'location'), [[60, 65], [55, 58], [62, 67]]),
}
coords = {
    'time': pd.date_range('2022-01-01', periods=3),
    'location': ['New York', 'London'],
}
xarray_ds = xr.Dataset(data, coords=coords)

# Convert the xarray Dataset to a pandas DataFrame
pandas_df = xarray_ds.to_dataframe()

print("Xarray Dataset:")
print(xarray_ds)
print("\nPandas DataFrame:")
print(pandas_df)
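
Label-based indexing is where xarray really shines: you select and aggregate by dimension names rather than positional axes. A short sketch using the standard sel() and mean() methods:

import xarray as xr
import pandas as pd

data = {'temperature': (('time', 'location'), [[25, 30], [28, 32], [26, 31]])}
coords = {'time': pd.date_range('2022-01-01', periods=3),
          'location': ['New York', 'London']}
ds = xr.Dataset(data, coords=coords)

# Select all observations for one location by label, not position
print(ds.sel(location='London')['temperature'].values)  # [30 32 31]

# Aggregate along a named dimension instead of a numeric axis
print(ds['temperature'].mean(dim='time').values)        # [26.33 31.]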

5. dlt

dlt is a library where, in its simplest form, you hand a response.json() payload to a function and it automatically manages typing, normalisation, and loading. In its most complex form, you can do almost anything you want, from memory management and multithreading to extraction DAGs.

Why use it

Maintenance becomes simple thanks to short, declarative code; it runs wherever Python runs; and it offers a user-friendly, declarative interface.

Example —

In the example below, I use DuckDB as the destination for the response from an API.

import dlt
from dlt.sources.helpers import requests

url = "https://api.github.com/repos/dlt-hub/dlt/issues"

def get_response():
    # Make a request and check if it was successful
    response = requests.get(url)
    response.raise_for_status()
    print(response.json())
    return response.json()

if __name__ == '__main__':
    # Configure the pipeline with your destination details
    pipeline = dlt.pipeline(
        pipeline_name="github_issues",
        destination="duckdb",
        dataset_name="data",
    )

    # The response contains a list of issues
    load_info = pipeline.run(get_response(), table_name="test")

    print(load_info)

dlt automatically turns JSON returned by any source into a live dataset stored in the destination. It does this in 3 steps:

Extract — The script extracts the data from the source, parses the JSON, and passes it to dlt as input.

Normalize — dlt recursively unpacks this nested structure, making it ready to be loaded. This creates a schema, which automatically evolves with any future changes in the source data.

Load — The data is then loaded into your chosen destination. dlt uses configurable, idempotent, atomic loads that ensure the data safely ends up there.
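
If you want more control over how a table is loaded, you can wrap the extraction in a resource. A minimal sketch, assuming the @dlt.resource decorator with its table_name and write_disposition options:

import dlt
from dlt.sources.helpers import requests

@dlt.resource(table_name="issues", write_disposition="replace")
def github_issues():
    # Yield the parsed JSON; dlt handles typing, normalisation, and loading
    response = requests.get("https://api.github.com/repos/dlt-hub/dlt/issues")
    response.raise_for_status()
    yield response.json()

pipeline = dlt.pipeline(
    pipeline_name="github_issues_resource",
    destination="duckdb",
    dataset_name="data",
)
print(pipeline.run(github_issues()))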

Conclusion

I hope you enjoyed this list of Python libraries. I have used almost all of them in my projects, and yes, they are super helpful.

Python offers a rich ecosystem of libraries that cater to various needs and domains. Whether you’re working on data analysis, machine learning, web development, or scientific computing, there’s likely a library available to streamline your tasks and boost productivity.

So, don’t hesitate to dive into the vast Python library ecosystem and make the most of these invaluable resources for your projects.

Pravash

I am a passionate Data Engineer and Technology Enthusiast. Here I am using this platform to share my knowledge and experience on tech stacks.