Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


0 votes
209 views
in Technique[技术] by (71.8m points)

python - Why does a string cause entire pandas DataFrame to be non-numerical?

If I create a pandas DataFrame from numerical values, the DataFrame holds numbers. However, if the first element is a string, e.g. 'a', the entire DataFrame goes grey and every number in it is converted to a string, e.g. 3 becomes '3'. Why does this happen, and how can I retain the mixed data types?

import numpy as np
import pandas as pd

# All-numeric input: the values stay numeric
AA = pd.DataFrame(np.asarray([1, 2, 3]))
# Strings mixed with an int: 3 ends up as '3'
AA2 = pd.DataFrame(np.asarray(['a', 'b', 3]))

The output is:

[image not available]



1 Answer

0 votes
by (71.8m points)

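The first problem is np.asarray(['a','b',3]) itself. A NumPy array has a single dtype for all of its elements, so the integer 3 is converted to the string '3' before pandas ever sees the data. The resulting DataFrame column has dtype object and contains only strings: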

AA2 = pd.DataFrame(np.asarray(['a','b',3]))
print (AA2.dtypes)
0    object
dtype: object

print (AA2[0].apply(lambda x: type(x)))
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: 0, dtype: object
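To make the coercion visible on the NumPy side, here is a minimal sketch. The variable names are only illustrative, and passing dtype=object is one possible way to keep the original Python types, not necessarily the approach the answer intends:

```python
import numpy as np
import pandas as pd

# NumPy picks one dtype for the whole array, so the int 3 becomes the string '3'.
coerced = np.asarray(['a', 'b', 3])
coerced_kind = coerced.dtype.kind      # 'U' stands for a Unicode string dtype
last_coerced = coerced[2]

# With dtype=object (an assumption of this sketch) each element keeps its Python type.
preserved = np.asarray(['a', 'b', 3], dtype=object)
last_preserved = preserved[2]

# The DataFrame built from the object array holds str, str, int.
df = pd.DataFrame(preserved)
element_types = df[0].map(type).tolist()

print(coerced_kind, repr(last_coerced), repr(last_preserved), element_types)
```

The column dtype is still object in both cases; what changes is whether the last element is the string '3' or the integer 3.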

If you pass a plain list instead, the column keeps mixed data, with the strings and the integer each retaining their own type:

AA2 = pd.DataFrame(['a','b',3])

print (AA2.dtypes)
0    object
dtype: object

print (AA2[0].apply(lambda x: type(x)))
0    <class 'str'>
1    <class 'str'>
2    <class 'int'>
Name: 0, dtype: object

But working with mixed values is problematic: most numeric operations fail on such a column, so it is best to avoid it.
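If a mixed column cannot be avoided at the source, two common ways out are sketched below. This is a hedged illustration, not part of the original answer: the column names label and value and the sample data are hypothetical, and pd.to_numeric with errors='coerce' is just one option for recovering the numbers.

```python
import pandas as pd

mixed = pd.Series(['a', 'b', 3])

# Option 1: convert to numbers; entries that cannot be parsed become NaN.
numeric = pd.to_numeric(mixed, errors='coerce')
total = numeric.sum()                  # NaN values are skipped by default

# Option 2: keep each kind of data in its own column (hypothetical data),
# so every column has a single, well-defined dtype.
df = pd.DataFrame({'label': ['a', 'b', 'c'], 'value': [1, 2, 3]})
value_is_numeric = pd.api.types.is_numeric_dtype(df['value'])

print(numeric.dtype, total, value_is_numeric)
```

Option 1 discards the strings, so it only fits when the non-numeric entries are noise. Option 2 is usually the cleaner design, because a DataFrame stores one dtype per column rather than one for the whole table.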


...