Python Pandas: How to Copy Columns Between DataFrames

Copying one or more columns from one Pandas DataFrame to another is a frequent operation when merging datasets, creating new features, or restructuring data. Pandas offers several ways to achieve this, but it's crucial to understand how DataFrame indexes affect the alignment of data during these operations to avoid unexpected NaN values.

This guide explains how to copy columns between DataFrames using direct assignment, the .copy() method, and techniques to handle index misalignments.

The Goal: Transferring Column Data

We want to take one or more columns from a source DataFrame (df1) and add them as new columns (or overwrite existing ones) in a target DataFrame (df2).

Example DataFrames

import pandas as pd
import numpy as np # For NaN example

# Source DataFrame
df1_data = {
    'ID': [101, 102, 103, 104],
    'Product_Name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor'],
    'Category': ['Electronics', 'Accessory', 'Accessory', 'Electronics'],
    'Sales_Q1': [150, 2000, 750, 3000]
}
df1 = pd.DataFrame(df1_data)
# df1 uses default 0-based index: 0, 1, 2, 3

# Target DataFrame
df2_data = {
    'Employee': ['Alice', 'Bob', 'Charlie', 'David'],
    'Department': ['Sales', 'IT', 'Sales', 'Marketing']
}
df2 = pd.DataFrame(df2_data)
# df2 also uses default 0-based index: 0, 1, 2, 3

print("Original df1:")
print(df1)
print()

print("Original df2:")
print(df2)

Output:

Original df1:
    ID Product_Name     Category  Sales_Q1
0  101       Laptop  Electronics       150
1  102        Mouse    Accessory      2000
2  103     Keyboard    Accessory       750
3  104      Monitor  Electronics      3000

Original df2:
  Employee Department
0    Alice      Sales
1      Bob         IT
2  Charlie      Sales
3    David  Marketing

Method 1: Direct Assignment (Basic Column Copying)

This is the most straightforward way when the DataFrames have aligning indexes or you intend to align based on the existing index.

Copying a Single Column

Assign the Series from df1 to a new column name in df2.

import pandas as pd

df1_example = pd.DataFrame({
    'ID': [101, 102, 103, 104], 'Sales_Q1': [150, 2000, 750, 3000]
})
df2_example = pd.DataFrame({
    'Employee': ['Alice', 'Bob', 'Charlie', 'David']
})

# ✅ Copy 'Sales_Q1' from df1_example to df2_example as a new column
df2_example['Q1_Sales'] = df1_example['Sales_Q1']

print("df2 after copying 'Sales_Q1' as 'Q1_Sales':")
print(df2_example)

Output:

df2 after copying 'Sales_Q1' as 'Q1_Sales':
  Employee  Q1_Sales
0    Alice       150
1      Bob      2000
2  Charlie       750
3    David      3000

note

Pandas aligns the data based on the index. Since both df1 and df2 have the default 0-based index here, the copy is row-by-row.

Copying Multiple Columns

Select multiple columns from df1 and assign them to df2.

import pandas as pd

df1_example = pd.DataFrame({
    'ID': [101, 102, 103, 104],
    'Product_Name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor'],
    'Category': ['Electronics', 'Accessory', 'Accessory', 'Electronics']
})
df2_example = pd.DataFrame({
    'Region': ['North', 'South', 'East', 'West']
})

# ✅ Copy 'Product_Name' and 'Category' from df1_example to df2_example
columns_to_copy = ['Product_Name', 'Category']
df2_example[columns_to_copy] = df1_example[columns_to_copy]
# Or more directly:
df2_example[['Product_Name', 'Category']] = df1_example[['Product_Name', 'Category']]

print("df2 after copying multiple columns:")
print(df2_example)

Output:

df2 after copying multiple columns:
  Region Product_Name     Category
0  North       Laptop  Electronics
1  South        Mouse    Accessory
2   East     Keyboard    Accessory
3   West      Monitor  Electronics

note

You must use double square brackets df1[['col1', 'col2']] to select multiple columns as a DataFrame for assignment.

Important: Use Bracket Notation `[]`, Not Dot Notation `.` for Assignment

When creating new columns in df2, always use bracket notation df2['NewColumn'] = .... Using dot notation (df2.NewColumn = ...) can sometimes work for accessing existing columns but is not reliable for creating new ones and can lead to warnings or incorrect behavior (e.g., creating an attribute instead of a DataFrame column).

# Correct:
df2['New_Col'] = df1['Source_Col']

# Incorrect for creating new columns:
df2.New_Col = df1.Source_Col # Avoid this for assignment

Method 2: Using `DataFrame.copy()` (for Subsets or Independent Copies)

If you want to copy a subset of df1 into a new DataFrame df2 (or overwrite df2 completely), you can select the desired columns from df1 and then use the .copy() method.

import pandas as pd

df1_example = pd.DataFrame({
    'ID': [101, 102, 103, 104],
    'Product_Name': ['Laptop', 'Mouse', 'Keyboard', 'Monitor'],
    'Category': ['Electronics', 'Accessory', 'Accessory', 'Electronics'],
    'Sales_Q1': [150, 2000, 750, 3000]
})

# Select desired columns from df1
columns_to_extract = ['Product_Name', 'Sales_Q1']
df1_subset = df1_example[columns_to_extract]

# ✅ Create df2 as a copy of this subset
df2_from_copy = df1_subset.copy()
# df2_from_copy will be an independent copy; changes to df1_subset or df1 won't affect it.

print("New df2 created from a copy of df1 subset:")
print(df2_from_copy)

Output:

New df2 created from a copy of df1 subset:
  Product_Name  Sales_Q1
0       Laptop       150
1        Mouse      2000
2     Keyboard       750
3      Monitor      3000

note

df.copy(deep=True): Creates a deep copy. Changes to the copy won't affect the original. deep=True is the default.

Handling Index Alignment: The Cause of `NaN` Values

A common pitfall when copying columns is getting NaN (Not a Number) values in the target DataFrame. This happens if the indexes of df1 and df2 do not align for the rows being assigned. Pandas tries to match rows based on their index labels during assignment.

When Indexes Differ

import pandas as pd
import numpy as np

df1_custom_idx = pd.DataFrame({
    'Value_A': [10, 20, 30, 40]
}, index=['a', 'b', 'c', 'd']) # Custom string index

df2_numeric_idx = pd.DataFrame({
    'Property_X': [100, 200, 300, 400]
}, index=[0, 1, 2, 3]) # Default numeric index

print("df1_custom_idx:")
print(df1_custom_idx)
print()
print("df2_numeric_idx:")
print(df2_numeric_idx)
print()

# Attempt to copy 'Value_A' to df2_numeric_idx
df2_numeric_idx['Copied_Value_A'] = df1_custom_idx['Value_A']

print("df2 after attempting copy with misaligned indexes:")
print(df2_numeric_idx)

Output:

df1_custom_idx:
   Value_A
a       10
b       20
c       30
d       40

df2_numeric_idx:
   Property_X
0         100
1         200
2         300
3         400

df2 after attempting copy with misaligned indexes:
   Property_X  Copied_Value_A
0         100             NaN
1         200             NaN
2         300             NaN
3         400             NaN

note

Since there are no matching index labels, Pandas can not align the data and fills the new column in df2 with NaN.

Solution 1: Homogenize Indexes

If the rows correspond logically but have different index labels, you can make the target DataFrame's index match the source's index before the copy.

import pandas as pd
import numpy as np

df1_custom_idx = pd.DataFrame({'Value_A': [10,20,30,40]}, index=['a','b','c','d'])
df2_numeric_idx_copy = pd.DataFrame({'Property_X': [100,200,300,400]}, index=[0,1,2,3])


# ✅ Make df2's index match df1's index
# WARNING: This assumes df2 has the same number of rows and logical order as df1.
df2_numeric_idx_copy.index = df1_custom_idx.index

df2_numeric_idx_copy['Copied_Value_A'] = df1_custom_idx['Value_A']
print("df2 after homogenizing index and copying:")
print(df2_numeric_idx_copy)

Output:

df2 after homogenizing index and copying:
   Property_X  Copied_Value_A
a         100              10
b         200              20
c         300              30
d         400              40

note

This solution only works if the number of rows is the same and the order implies a one-to-one mapping.

Solution 2: Bypass Index Alignment with `.to_numpy()` or `.values` (Recommended if Indexes Should Differ)

If you want to copy the column data based purely on its order (position), ignoring the index labels, convert the source column(s) to a NumPy array first using .to_numpy() (preferred) or the .values attribute. Assigning a NumPy array to a DataFrame column bypasses Pandas' index alignment.

import pandas as pd

df1_custom_idx = pd.DataFrame({'Value_A': [10,20,30,40]}, index=['a','b','c','d'])
df2_numeric_idx_copy2 = pd.DataFrame({'Property_X': [100,200,300,400]}, index=[0,1,2,3])

# ✅ Copy using .to_numpy() to bypass index alignment
df2_numeric_idx_copy2['Value_A_numpy'] = df1_custom_idx['Value_A'].to_numpy()

print("df2 after copying with .to_numpy():")
print(df2_numeric_idx_copy2)
print()

# For multiple columns:
df1_multi_cols = pd.DataFrame({'A': [1,2], 'B': [3,4]}, index=['x','y'])
df2_target = pd.DataFrame({'C': [5,6]}, index=[0,1])
df2_target[['New_A', 'New_B']] = df1_multi_cols[['A', 'B']].to_numpy()
print("Multiple columns with .to_numpy():\n", df2_target)

Output:

df2 after copying with .to_numpy():
   Property_X  Value_A_numpy
0         100             10
1         200             20
2         300             30
3         400             40

Multiple columns with .to_numpy():
    C  New_A  New_B
0  5      1      3
1  6      2      4

note

df1['Column'].to_numpy(): Converts the Series to a NumPy array, discarding index information.
df1['Column'].values: Similar effect, but .to_numpy() is generally preferred in modern Pandas. This assumes the source and target DataFrames have the same number of rows if you're assigning to an existing column or creating new columns for the entire length of the target.

Conclusion

Copying columns between Pandas DataFrames is a fundamental operation:

Direct assignment (df2['NewCol'] = df1['SourceCol'] or df2[cols_list] = df1[cols_list]) is the simplest method when indexes align or are intended to align. Always use bracket notation [] for assigning to new columns.
Use df_subset.copy() if you are creating a new DataFrame from a subset of another.
To avoid NaN values due to misaligned indexes:
- Either homogenize the indexes (df2.index = df1.index) if the DataFrames correspond row-by-row.
- Or, more commonly and robustly, bypass index alignment by assigning the .to_numpy() (or .values) representation of the source column(s) to the target DataFrame if you want to copy based on position/order.

Understanding Pandas' index alignment behavior is key to successfully copying columns without introducing unintended missing values.

The Goal: Transferring Column Data​

Example DataFrames​

Method 1: Direct Assignment (Basic Column Copying)​

Copying a Single Column​

Copying Multiple Columns​

Important: Use Bracket Notation [], Not Dot Notation . for Assignment​

Method 2: Using DataFrame.copy() (for Subsets or Independent Copies)​

Handling Index Alignment: The Cause of NaN Values​

When Indexes Differ​

Solution 1: Homogenize Indexes​

Solution 2: Bypass Index Alignment with .to_numpy() or .values (Recommended if Indexes Should Differ)​

Conclusion​

Table of Contents

The Goal: Transferring Column Data

Example DataFrames

Method 1: Direct Assignment (Basic Column Copying)

Copying a Single Column

Copying Multiple Columns

Important: Use Bracket Notation `[]`, Not Dot Notation `.` for Assignment

Method 2: Using `DataFrame.copy()` (for Subsets or Independent Copies)

Handling Index Alignment: The Cause of `NaN` Values

When Indexes Differ

Solution 1: Homogenize Indexes

Solution 2: Bypass Index Alignment with `.to_numpy()` or `.values` (Recommended if Indexes Should Differ)

Conclusion