Convert variable to multiple columns with Pandas

user1402, published on May 25, 2018, 8:54 am

I have a legacy datafile that contains data in the following format:

SURVEY  NUM TEMPORAL
WHS 1   Byz
WHS 1   Byz_Um
WHS 1   IAII
WHS 1   L_Isl
WHS 1   L_Rom
WHS 1   Mod
WHS 1   Nab
WHS 2   Byz
WHS 2   Mod
WHS 2   Unk
WHS 2   MP
WHS 3   Byz
WHS 3   Nab
WHS 3   LMP
WHS 3   UP
WHS 4   LMP
WHS 4   MP
WHS 4   UP
WHS 5   Byz
WHS 5   Unk
WHS 5   LMP

etc.

Essentially, the column "NUM" is a unique identifier for a specific site, and the column "TEMPORAL" is a value associated with that site. For whatever reason, the original file repeats the site over several lines when it has multiple temporal occupations (this is archaeological data). I would like to use Pandas to convert this to something like so:

SURVEY NUM  Byz Byz_Um IAII L_Isl LMP L_Rom Nab MP Mod Unk UP
WHS 1   1  1  1  1  1  0  0  0  0  0  1  0  0  0
WHS 2   1  0  0  0  0  0  0  0  0  1  1  0  0  1
WHS 3   1  0  0  0  0  0  0  0  1  1  1  0  1  0
WHS 4   0  0  0  0  0  0  0  0  0  0  0  1  0  1
WHS 5   1  0  0  0  0  0  0  1  0  0  0  0  1  0

Here a 1 is placed in a period's column if that TEMPORAL period exists for the site, and a 0 otherwise. I tried using df.pivot with "NUM" as the index and "TEMPORAL" as the columns, but that did not work. There are several thousand sites in this database, so doing it manually is not feasible. Any ideas?
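
For reference, here is a minimal sketch of one way this reshaping might be done. It uses pd.crosstab, which counts co-occurrences instead of reshaping existing values the way df.pivot does. The DataFrame name df, the variable name wide, and the inline sample rows are assumptions for illustration; the real file would need to be loaded first (its path and separator are not given here), and the resulting columns come out in sorted order, not the order shown above.

```python
import pandas as pd

# Sample rows copied from the listing above; in practice df would come
# from reading the legacy file (path and delimiter are not stated here).
rows = [
    ("WHS", 1, "Byz"), ("WHS", 1, "Byz_Um"), ("WHS", 1, "IAII"),
    ("WHS", 1, "L_Isl"), ("WHS", 1, "L_Rom"), ("WHS", 1, "Mod"),
    ("WHS", 1, "Nab"),
    ("WHS", 2, "Byz"), ("WHS", 2, "Mod"), ("WHS", 2, "Unk"), ("WHS", 2, "MP"),
    ("WHS", 3, "Byz"), ("WHS", 3, "Nab"), ("WHS", 3, "LMP"), ("WHS", 3, "UP"),
    ("WHS", 4, "LMP"), ("WHS", 4, "MP"), ("WHS", 4, "UP"),
    ("WHS", 5, "Byz"), ("WHS", 5, "Unk"), ("WHS", 5, "LMP"),
]
df = pd.DataFrame(rows, columns=["SURVEY", "NUM", "TEMPORAL"])

# Count how often each TEMPORAL value appears per (SURVEY, NUM) pair.
# Periods that never occur for a site are filled with 0.
wide = pd.crosstab([df["SURVEY"], df["NUM"]], df["TEMPORAL"])

# Cap counts at 1 so a period listed twice for a site still gives a 0/1 flag,
# then turn SURVEY and NUM back into ordinary columns.
wide = wide.clip(upper=1).reset_index()
wide.columns.name = None  # drop the leftover "TEMPORAL" axis label

print(wide)
```

With the sample rows this gives one row per site and one 0/1 column per period. If several surveys reuse the same NUM values, keeping SURVEY in the row key (as above) keeps those sites separate.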
