Feature Engineering
Create better features for your models.
Better features = better AI.
What is Feature Engineering?
Feature engineering means creating new features from existing data to improve model performance.
Like giving AI better hints!
Why Is It Important?
Good features often matter more than a complex model:
Simple model + great features > Complex model + bad features
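To make this concrete, here is a minimal sketch on synthetic data. The data, the variable names, and the choice of `LinearRegression` are assumptions made for illustration. The target depends on an area (length × width), which a linear model cannot represent from the raw columns alone. The sketch only compares the same simple model with and without one engineered feature, and it reports in-sample scores, so treat it as an illustration rather than a benchmark.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic plots of land: the price depends on the area (length * width).
length = rng.uniform(5, 50, size=500)
width = rng.uniform(5, 50, size=500)
price = 100.0 * length * width + rng.normal(0, 500, size=500)

# Raw features only: a linear model cannot express the product of two columns.
X_raw = np.column_stack([length, width])
r2_raw = LinearRegression().fit(X_raw, price).score(X_raw, price)

# Same simple model, plus one engineered feature (the area).
X_eng = np.column_stack([length, width, length * width])
r2_eng = LinearRegression().fit(X_eng, price).score(X_eng, price)

print(f"R^2 with raw features: {r2_raw:.3f}")
print(f"R^2 with area feature: {r2_eng:.3f}")
```

The model did not change between the two runs. Only the hint it was given changed.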
Common Techniques
**1. Creating New Features**
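```python
import pandas as pd

# Make sure the column is a datetime type before using .dt
df['date'] = pd.to_datetime(df['date'])

# From one date column, create multiple features
df['year'] = df['date'].dt.year
df['month'] = df['date'].dt.month
df['day_of_week'] = df['date'].dt.dayofweek  # Monday=0 ... Sunday=6
df['is_weekend'] = df['day_of_week'].isin([5, 6])  # Saturday or Sunday
```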
**2. Combining Features**
```python
# House data
df['price_per_sqft'] = df['price'] / df['sqft']
df['total_rooms'] = df['bedrooms'] + df['bathrooms']
```
**3. Binning**
```python
import pandas as pd

# Age groups
df['age_group'] = pd.cut(df['age'], bins=[0, 18, 35, 50, 100],
                         labels=['child', 'young', 'middle', 'senior'])
```
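Bin edges are easy to get wrong, so it can help to check how boundary values are assigned. The sketch below uses a made-up `ages` series (an assumption for illustration) with the same bins and labels. By default `pd.cut` uses intervals that are open on the left and closed on the right, so an age of exactly 0 falls outside the first bin unless `include_lowest=True` is passed.

```python
import pandas as pd

# Hypothetical ages chosen to sit on the bin edges.
ages = pd.Series([0, 18, 19, 35, 50, 100])
bins = [0, 18, 35, 50, 100]
labels = ['child', 'young', 'middle', 'senior']

# Default: intervals are (0, 18], (18, 35], (35, 50], (50, 100].
default_groups = pd.cut(ages, bins=bins, labels=labels)

# include_lowest=True also puts the lowest edge (0) in the first bin.
inclusive_groups = pd.cut(ages, bins=bins, labels=labels, include_lowest=True)

print(pd.DataFrame({'age': ages,
                    'default': default_groups,
                    'include_lowest': inclusive_groups}))
```

With the default settings, age 0 becomes a missing value and age 18 is labelled 'child'.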
**4. Encoding Categories**
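```python
import pandas as pd

# One-hot encoding: one indicator column per city, in sorted order
df_encoded = pd.get_dummies(df, columns=['city'])
# New columns: city_Austin, city_Denver, city_Miami
# Austin → [1, 0, 0]
# Denver → [0, 1, 0]
# Miami  → [0, 0, 1]
# (recent pandas versions show True/False; pass dtype=int to get 1/0)
```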
Real Example - House Prices
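Original features: price, size, bedrooms, built_year
New features:
- price_per_sqft = price / size
- many_bedrooms = bedrooms > 2
- age_of_house = current_year - built_year

If price is the value you want to predict, leave price_per_sqft out of the model inputs, because it is computed from the target.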
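As a rough sketch, the derived columns for this example could be computed with pandas as follows. The `houses` table, its values, the reference year, and the column name `many_bedrooms` (a flag for `bedrooms > 2`) are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical listings; the values are made up for this example.
houses = pd.DataFrame({
    'price': [300_000, 450_000, 250_000],
    'size': [1500, 2250, 1000],          # square feet
    'bedrooms': [3, 4, 2],
    'built_year': [1990, 2010, 1975],
})

current_year = 2024  # assumed reference year

houses['price_per_sqft'] = houses['price'] / houses['size']
houses['age_of_house'] = current_year - houses['built_year']
houses['many_bedrooms'] = houses['bedrooms'] > 2  # True when there are 3 or more bedrooms

print(houses)
```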
Feature Scaling
```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
```
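`StandardScaler` rescales each feature to mean 0 and standard deviation 1. When the data is split into training and test sets, a common pattern is to fit the scaler on the training rows only and reuse it on the test rows, so that no information from the test set leaks into training. The sketch below assumes a small synthetic feature matrix. The data and variable names are illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: two features on very different scales.
X = np.column_stack([
    rng.normal(1500, 400, size=200),  # e.g. size in square feet
    rng.integers(1, 6, size=200),     # e.g. number of bedrooms
])

X_train, X_test = train_test_split(X, test_size=0.25, random_state=0)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean and std from training rows only
X_test_scaled = scaler.transform(X_test)        # reuse those statistics on the test rows

print(X_train_scaled.mean(axis=0))  # close to 0 for each feature
print(X_train_scaled.std(axis=0))   # close to 1 for each feature
```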
Feature Selection
Remove features that don't help:
```python
from sklearn.feature_selection import SelectKBest

selector = SelectKBest(k=10)
X_selected = selector.fit_transform(X, y)
```
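`SelectKBest` scores each feature against the target and keeps the `k` highest-scoring ones. Its default score function is `f_classif`, which suits classification targets; `f_regression` is the usual counterpart for regression. The sketch below assumes a synthetic classification dataset with made-up column names and shows how to see which columns were kept.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic data: 20 features, of which only 5 carry signal.
X_arr, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                               n_redundant=0, random_state=0)
X = pd.DataFrame(X_arr, columns=[f'f{i}' for i in range(20)])

# Spell out the score function instead of relying on the default.
selector = SelectKBest(score_func=f_classif, k=10)
X_selected = selector.fit_transform(X, y)

# get_support() returns a boolean mask over the original columns.
kept_columns = X.columns[selector.get_support()]
print(list(kept_columns))
```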
Remember
- Domain knowledge helps
- Try different combinations
- Remove correlated features
- Scale numeric features
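One possible way to remove correlated features is to compute the pairwise correlation matrix and drop one column from each highly correlated pair. The sketch below is a simple heuristic, not the only approach. The data, the column names, and the 0.95 threshold are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
size_sqft = rng.uniform(500, 3000, size=200)

# Hypothetical data: size_sqm repeats the information in size_sqft.
df = pd.DataFrame({
    'size_sqft': size_sqft,
    'size_sqm': size_sqft * 0.092903,
    'bedrooms': rng.integers(1, 6, size=200),
})

threshold = 0.95  # assumed cutoff; tune it for your data

corr = df.corr().abs()
# Keep only the upper triangle so each pair is checked once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop the later column of any pair whose correlation exceeds the threshold.
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
df_reduced = df.drop(columns=to_drop)

print(to_drop)
print(list(df_reduced.columns))
```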