SAWarning when querying with SQLAlchemy in pandas df

I query my SQLAlchemy-mapped schema directly into a pandas DataFrame, and I get an annoying SAWarning from pandas that I would like to address. Here's a simplified version.

class School(Base):
    __tablename__ = 'DimSchool'

    id = Column('SchoolKey', Integer, primary_key=True)
    name = Column('SchoolName', String)
    district = Column('SchoolDistrict', String)


class StudentScore(Base):
    __tablename__ = 'FactStudentScore'

    StudentKey = Column('StudentKey', Integer, ForeignKey('DimStudent.StudentKey'), primary_key=True)
    SchoolKey = Column('SchoolKey', Integer, ForeignKey('DimSchool.SchoolKey'), primary_key=True)
    PointsPossible = Column('PointsPossible', Integer)
    PointsReceived = Column('PointsReceived', Integer)

    student = relationship("Student", backref='studentscore')
    school = relationship("School", backref='studentscore')

      

I query the data like this:

sch = session.query(StudentScore, School).\
    join(School).filter(School.name.like('%Dever%'))

testdf = pd.read_sql(sch.statement, sch.session.bind)

      

And then get this warning:

SAWarning: Column 'SchoolKey' on table <sqlalchemy.sql.selectable.Select at 0x1ab7abe0; Select object> being replaced by Column('SchoolKey', Integer(), table=<Select object>, primary_key=True, nullable=False), which has the same key.  Consider use_labels for select() statements.

      

I get this warning for every additional table (class) included in my join. The message always refers to a foreign key column.

Has anyone else run into this warning and identified the root cause? Or have you just ignored it?

EDIT / UPDATE:

Handling Duplicate Columns in Pandas DataFrame Constructor from SQLAlchemy Join

The people in that question seem to be discussing a related issue, but they use a different pandas method to build the DataFrame and want to keep the duplicate columns, not drop them. Does anyone have thoughts on how to implement a similar function that removes the duplicates as the query returns?



1 answer


For what it's worth, here's my limited answer.

For the following SAWarning:

SAWarning: Column 'SchoolKey' on table <sqlalchemy.sql.selectable.Select at 0x1ab7abe0; Select object> being replaced by Column('SchoolKey', Integer(), table=<Select object>, primary_key=True, nullable=False), which has the same key.  Consider use_labels for select() statements.

      



What this really tells you is that the selected columns include duplicate names, even though the columns belong to separate tables. In most cases this is harmless, because the duplicated columns are simply the join keys and hold the same values. However, I have run into situations where tables share a name for columns that hold different data (e.g. a Teacher table with a name column and a Student table with a name column). In those cases, rename the columns of the pandas DataFrame with an approach like this, or rename them in the underlying database tables.
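To make the root cause concrete, here is a minimal sketch that uses only the standard-library sqlite3 module instead of SQLAlchemy and pandas, so it illustrates the idea rather than what either library does internally. The tables are trimmed stand-ins for DimSchool and FactStudentScore from the question, and the sample rows and the column_names helper are made up for illustration. It shows that a join over both tables returns SchoolKey twice, and that aliasing every column as <table>_<column> (which, as far as I know, is the naming style use_labels applies) makes the names unique.

```python
import sqlite3

# In-memory database with two tables that share the column name SchoolKey.
# Trimmed, hypothetical stand-ins for the tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DimSchool (SchoolKey INTEGER PRIMARY KEY, SchoolName TEXT);
    CREATE TABLE FactStudentScore (
        StudentKey INTEGER, SchoolKey INTEGER, PointsReceived INTEGER
    );
    INSERT INTO DimSchool VALUES (1, 'Dever High');
    INSERT INTO FactStudentScore VALUES (10, 1, 95);
""")


def column_names(sql):
    """Return the result-set column names of a query, in order."""
    cursor = conn.execute(sql)
    return [description[0] for description in cursor.description]


# Selecting every column from both tables yields SchoolKey twice.
plain = column_names("""
    SELECT f.*, s.*
    FROM FactStudentScore AS f
    JOIN DimSchool AS s ON s.SchoolKey = f.SchoolKey
""")

# Aliasing each column as <table>_<column> keeps every name unique.
labelled = column_names("""
    SELECT f.StudentKey     AS FactStudentScore_StudentKey,
           f.SchoolKey      AS FactStudentScore_SchoolKey,
           f.PointsReceived AS FactStudentScore_PointsReceived,
           s.SchoolKey      AS DimSchool_SchoolKey,
           s.SchoolName     AS DimSchool_SchoolName
    FROM FactStudentScore AS f
    JOIN DimSchool AS s ON s.SchoolKey = f.SchoolKey
""")

print(plain)
print(labelled)
```

In SQLAlchemy you can get the same effect by hand by calling .label() on the columns you pass to session.query(...). Selecting only the columns you actually need also avoids pulling the join key twice in the first place.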

I'll keep an eye on this question, and if anyone posts a better answer, I'll happily accept it.
