from IPython.core.interactiveshell import InteractiveShell

# `ast_node_interactivity` is a setting that determines how the return value of the last line in a cell is displayed
# with `last_expr_or_assign`, the value of the last line is displayed whether it is a bare expression or an assignment
InteractiveShell.ast_node_interactivity = "last_expr_or_assign"
There’s an excellent blog post on why Pandas feels clunky for those coming from R:
In Python, however, I've found ibis to be a much more natural alternative to pandas for those coming from R.
ibis uses DuckDB as its default backend, and its API feels like a mix of DuckDB and dplyr.
import ibis
`_` in ibis is a deferred expression: a placeholder that resolves to the table a method is being called on. This is useful for chaining operations, because each step can refer to columns produced by the previous step without naming an intermediate variable.
from ibis import _
By default, ibis defers execution until you call execute(). Using ibis.options.interactive = True will make it so that expressions are immediately executed when displayed. This is useful for interactive exploration.
ibis.options.interactive = True
Let’s also import pandas to compare the two libraries.
import pandas as pd
Here’s the equivalent code in pandas and ibis for the example provided in the blog post:
For this last example, we have to compute the median per country with a group-by and then join it back to the original table to replace the outliers. This is similar to the pandas approach.