Week 6 at DataraFlow: Learning Resilience Through NumPy, Pandas, and Weather Data Insights
Introduction
Week 6 of my internship at DataraFlow was a major turning point in my data analytics journey.
The focus this week was on NumPy and Pandas — two foundational Python libraries for data manipulation and numerical computation.
Before this week, I had only scratched the surface of what arrays and data frames could do.
By the end, I was confidently loading, cleaning, and analyzing real datasets — but the process was far from easy.
Learning NumPy: My First Real Encounter With Arrays
We kicked off the week diving into NumPy, the backbone of numerical computing in Python.
At first, I found it hard to understand how arrays worked, especially the idea of dimensions.
Then came our Saturday class with our coordinator, Mr. Winner, who made everything click with one simple but powerful statement:
“You can determine the dimension of a NumPy array by just counting the square brackets at the beginning.”
That single tip changed how I viewed arrays. For an array written out as a nested list, I could now read the code and tell how many dimensions it had before running it.
Here’s a small example that helped solidify my understanding:
import numpy as np
# Creating arrays of different dimensions
a = np.array([1, 2, 3]) # 1D array
b = np.array([[1, 2, 3], [4, 5, 6]]) # 2D array
c = np.array([[[1, 2], [3, 4]]]) # 3D array
print(a.ndim) # 1
print(b.ndim) # 2
print(c.ndim) # 3
By simply counting the brackets, I could easily identify dimensions:
- [ ] → 1D
- [[ ]] → 2D
- [[[ ]]] → 3D
That moment of clarity was a small victory — one that taught me patience and persistence.
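To double-check the bracket rule, here is a minimal sketch that prints `ndim` next to `shape` for the same three arrays. The extra array `z` is my own addition for illustration: it is built with `np.zeros`, so there are no brackets to count, and `ndim` is the only way to know.

```python
import numpy as np

# The same three arrays as in the example above
a = np.array([1, 2, 3])
b = np.array([[1, 2, 3], [4, 5, 6]])
c = np.array([[[1, 2], [3, 4]]])

for name, arr in [("a", a), ("b", b), ("c", c)]:
    # ndim is the number of axes; shape gives the length of each axis
    print(name, "ndim =", arr.ndim, "shape =", arr.shape)

# Illustrative extra: an array not written as a nested list,
# so there are no opening brackets to count in the source code
z = np.zeros((2, 3, 4))
print("z ndim =", z.ndim)
```

The bracket trick is a quick reading aid for arrays typed out as nested lists. For anything created another way, `ndim` and `shape` are the reliable check.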
Transitioning to Pandas: Organizing and Exploring Real Data
After building confidence with NumPy, we moved to Pandas, where the data began to feel more real.
Our dataset for the week was titled Beijing_PEK_2014.csv, containing daily weather records for a full year.
The goal was to use Pandas to load, inspect, and analyze it effectively.
Here’s how I began the process:
import pandas as pd
# Load dataset
beijing = pd.read_csv("Beijing_PEK_2014.csv")
# Display basic info
print("Shape:", beijing.shape)
print("Columns:", list(beijing.columns))
Output (shape line only; the column list is omitted here):
Shape: (365, 23)
The dataset contained 365 daily records with 23 variables — temperatures, humidity, wind, pressure, and more.
I then examined missing values and general statistics:
# Checking missing data
missing = beijing.isna().sum().sort_values(ascending=False)
print(missing.head())
# Summary statistics
print(beijing.describe().T)
I noticed that columns like “Events” and “CloudCover” had missing values, while temperature data was mostly complete. This step built my confidence in data cleaning and validation.
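Raw counts of missing values are easier to compare as percentages. The sketch below shows one way to do that on a tiny made-up DataFrame. The "Events" and "CloudCover" names come from the dataset, while "Mean TemperatureC" and every value are assumptions for illustration only, not the real Beijing data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a few rows of weather data (all values are made up)
sample = pd.DataFrame({
    "Mean TemperatureC": [-2.0, 1.0, 3.0, 5.0],
    "CloudCover": [np.nan, 4.0, np.nan, 6.0],
    "Events": [np.nan, "Rain", np.nan, np.nan],
})

# isna() gives True/False per cell; the mean of booleans is the missing fraction
missing_pct = sample.isna().mean().mul(100).sort_values(ascending=False)
print(missing_pct)
```

On the real file, the same two lines would work with `beijing` in place of `sample`.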
Week 6 Task and Assessment
Our official task for Week 6 was:
“Use NumPy and Pandas to explore and analyze a real-world dataset. Identify missing values, compute basic statistics, and generate insights.”
Assessment Criteria:
- Correctly loading and exploring data with Pandas
- Handling missing values appropriately
- Computing meaningful statistics
- Drawing logical insights from the results
It was challenging at first, especially dealing with missing data and the large number of columns. But through consistent practice and the Saturday support sessions, everything began to fall into place.
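For the "handling missing values appropriately" criterion, here is a rough sketch of two common options. It is not necessarily what the task expected. The data is made up, the column names are illustrative, and treating a missing event as "no notable weather" is my own assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; column names and values are illustrative only
df = pd.DataFrame({
    "Mean TemperatureC": [1.0, np.nan, 3.0, 4.0],
    "Events": [np.nan, "Rain", np.nan, "Snow"],
})

cleaned = df.copy()
# Assumption: a missing event means no notable weather was recorded that day
cleaned["Events"] = cleaned["Events"].fillna("None")
# Fill gaps in a numeric series by linear interpolation between neighbours
cleaned["Mean TemperatureC"] = cleaned["Mean TemperatureC"].interpolate()

print(cleaned)
print("Remaining missing values:", int(cleaned.isna().sum().sum()))
```

Interpolation suits slowly changing daily readings such as temperature. For a column that is mostly empty, dropping it or leaving the gaps visible may be the more honest choice.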
Weather Data Analysis Results
After cleaning and exploring the Beijing dataset, I extended the analysis across six global cities: Beijing, Brasília, Cape Town, Delhi, London, and Moscow.
Here are my summarized insights:
Temperature
- Warmest: Brasília (avg 22.9°C) — tropical and consistent.
- Coldest: Moscow (avg 6°C) — long, cold winters.
- Most variable: Beijing and Delhi — wide seasonal swings.
Rainfall
- Wettest: Brasília (751 mm).
- Driest: Delhi (225 mm).
- Moderate: London, Cape Town, and Beijing.
Wind
- Windiest: Cape Town & London, which may make them candidates for wind energy.
- Highest Gusts: Delhi & Brasília — possible storm activity.
Data Quality
- Best: London (minimal missing data).
- Weakest: Moscow (missing precipitation).
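A comparison like the one above can be sketched with `pd.concat` and `groupby`. Everything in this example is hypothetical: the city labels, the column names `mean_temp_c` and `precip_mm`, and the numbers are placeholders, not the figures from my analysis.

```python
import pandas as pd

# Tiny made-up frames standing in for per-city CSV files
city_frames = {
    "CityA": pd.DataFrame({"mean_temp_c": [20.0, 22.0, 24.0],
                           "precip_mm": [0.0, 5.0, 1.0]}),
    "CityB": pd.DataFrame({"mean_temp_c": [-5.0, 5.0, 15.0],
                           "precip_mm": [2.0, 0.0, 0.0]}),
}

# Stack the frames and record which city each row came from
combined = pd.concat(
    [frame.assign(city=name) for name, frame in city_frames.items()],
    ignore_index=True,
)

# One row per city: average temperature, its spread, and total precipitation
summary = combined.groupby("city").agg(
    avg_temp_c=("mean_temp_c", "mean"),
    temp_std_c=("mean_temp_c", "std"),
    total_precip_mm=("precip_mm", "sum"),
)
print(summary)
print("Warmest:", summary["avg_temp_c"].idxmax())
print("Most variable:", summary["temp_std_c"].idxmax())
```

Using the standard deviation as the measure of "most variable" is one reasonable choice. The range between the yearly minimum and maximum would be another.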
Reflection: Growth Through Patience
Looking back, Week 6 wasn’t just about learning libraries — it was about resilience and understanding how I learn best. There were moments when concepts felt out of reach, but waiting for that Saturday class taught me the value of patience.
Sometimes, clarity takes time. When it came, it was worth every minute of confusion.
I now see data as more than rows and columns — it’s a story waiting to be told. And every dataset, no matter how messy, has something to teach.
Conclusion
Week 6 was the moment I transitioned from running code to understanding it. From counting brackets in NumPy to interpreting global weather data in Pandas, this week reminded me that learning is a journey, not a sprint.
As I move into Week 7, I’m more confident, more patient, and more excited to keep exploring the world of data — one dataset at a time.