Please, stop looping over a dataframe

Looping through a dataframe row by row is slow and rarely necessary. If you really must iterate, here are four ways to do it; iterrows() is usually the slowest of them.

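import pandas as pd
t = pd.DataFrame({'a': range(0, 100), 'b': range(0, 100)})

# 1. iterrows(): builds a Series for every row
C = []
for i, r in t.iterrows():
    C.append((r['a'], r['b']))

# 2. itertuples(): yields namedtuples; position 0 is the index
C = []
for ir in t.itertuples():
    C.append((ir[1], ir[2]))

# 3. zip() over only the columns you need
C = []
for r in zip(t['a'], t['b']):
    C.append((r[0], r[1]))

# 4. zip() over the plain Python lists returned by to_dict("list")
C = []
for r in zip(*t.to_dict("list").values()):
    C.append((r[0], r[1]))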
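
To see how the four approaches compare on your own machine, here is a minimal timing sketch. The function names and the repeat count are my own choices for illustration, and the numbers will vary with pandas version and dataframe size, so treat the output as a rough guide only.

```python
import timeit

import pandas as pd

t = pd.DataFrame({"a": range(0, 100), "b": range(0, 100)})


def with_iterrows():
    # One Series object is created per row.
    return [(r["a"], r["b"]) for _, r in t.iterrows()]


def with_itertuples():
    # index=False drops the index, so fields are just a and b.
    return [(r.a, r.b) for r in t.itertuples(index=False)]


def with_zip():
    # Iterate over the two columns in lockstep.
    return list(zip(t["a"], t["b"]))


def with_to_dict():
    # Convert every column to a Python list first, then zip the lists.
    return list(zip(*t.to_dict("list").values()))


timings = {}
for fn in (with_iterrows, with_itertuples, with_zip, with_to_dict):
    # Total seconds for 20 calls of each function.
    timings[fn.__name__] = timeit.timeit(fn, number=20)
    print(f"{fn.__name__:16s} {timings[fn.__name__]:.4f} s")
```

All four functions return the same list of (a, b) pairs, so the only difference that matters is the time each one takes.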

A list comprehension over zipped columns is an alternative to the .apply() method, yet I haven’t seen a significant speed benefit from it.

result = [query_distance(conn, route, first_visit, last_visit)
          for route, first_visit, last_visit in zip(rs.RouteID, rs.FirstVisitTime, rs.LastVisitTime)]
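
The snippet above depends on a database connection and a helper that are not shown here. As a self-contained sketch of the same pattern, the version below swaps in a hypothetical stand-in for query_distance (it ignores conn and just returns the visit duration in seconds) and a made-up two-row rs dataframe, then runs both the .apply() form and the list comprehension.

```python
import pandas as pd


def query_distance(conn, route, first_visit, last_visit):
    # Hypothetical stand-in: the real function presumably queries a database.
    # Here it ignores conn and returns the visit duration in seconds.
    return (last_visit - first_visit).total_seconds()


conn = None  # placeholder, no real connection in this sketch

# Made-up sample data with the column names used in the post.
rs = pd.DataFrame({
    "RouteID": [1, 2],
    "FirstVisitTime": pd.to_datetime(["2022-01-01 08:00", "2022-01-01 09:00"]),
    "LastVisitTime": pd.to_datetime(["2022-01-01 08:30", "2022-01-01 10:00"]),
})

# Row-wise apply: each row is passed to the lambda as a Series.
via_apply = rs.apply(
    lambda x: query_distance(conn, x.RouteID, x.FirstVisitTime, x.LastVisitTime),
    axis=1,
).tolist()

# List comprehension over zipped columns: no per-row Series is built.
via_zip = [
    query_distance(conn, route, first_visit, last_visit)
    for route, first_visit, last_visit in zip(
        rs.RouteID, rs.FirstVisitTime, rs.LastVisitTime
    )
]

print(via_apply)
print(via_zip)
```

When the called function is dominated by I/O, as a database query would be, the iteration overhead is small by comparison, which may explain why the two forms feel equally fast.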

Another example of ‘apply method vs. list comprehension’. This one uses to_records(), which returns a NumPy record array (rec.array). That is an awesome thing, because each record exposes its fields as attributes.

ra_expanded.apply(lambda x: query_collected_bins(conn, x.RouteID, x.StartTime, x.EndTime), axis=1)

records = ra_expanded.loc[:,["RouteID", "StartTime", "EndTime"]].to_records()
[query_collected_bins(conn, int(x.RouteID), x.StartTime, x.EndTime) for x in records]
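
Here is a small sketch of what to_records() gives you, using made-up data in place of ra_expanded. It passes index=False, which is my own choice to keep the index out of the records (by default the index is included as the first field). It also shows why the int() cast can be useful: the fields come back as NumPy scalars, which some database drivers may not accept as parameters.

```python
import numpy as np
import pandas as pd

# Made-up sample data with the column names used in the post.
ra_expanded = pd.DataFrame({
    "RouteID": [7, 8],
    "StartTime": pd.to_datetime(["2022-01-01 08:00", "2022-01-01 09:00"]),
    "EndTime": pd.to_datetime(["2022-01-01 08:30", "2022-01-01 10:00"]),
    "Note": ["x", "y"],
})

# Keep only the needed columns and convert them to a record array.
records = ra_expanded.loc[:, ["RouteID", "StartTime", "EndTime"]].to_records(index=False)

print(type(records))        # a NumPy record array
print(records.dtype.names)  # field names, taken from the column names

# Each record exposes its fields as attributes; values are NumPy scalars.
first = records[0]
print(type(first.RouteID))

# Cast to plain Python ints before handing the values to other libraries.
route_ids = [int(x.RouteID) for x in records]
print(route_ids)
```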

Want something like SQL’s LAG? Use the shift() method.

df['col_diff'] = df['col'] - df['col'].shift(1)

If you only need the differences, you can use .diff() too.

df['col_diff'] = df.col.diff()
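
A quick sketch on a made-up four-row column shows that the two forms give the same result. The column names lagged and col_diff_alt are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col": [10, 13, 19, 20]})

# shift(1) moves every value down one row; the first row has no predecessor, so it becomes NaN.
df["lagged"] = df["col"].shift(1)

# Difference to the previous row, written out by hand ...
df["col_diff"] = df["col"] - df["col"].shift(1)

# ... and with the built-in shortcut.
df["col_diff_alt"] = df["col"].diff()

print(df)
```

The first row of both difference columns is NaN, since there is no previous row to subtract.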
