Python Keras deep NN code for tabular categorical features: how to predict on categories unseen in the training data
Budget: $10-30 USD
1
Use an embedding layer for the input layers: one-hot encoding for categorical values.
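One possible reading of this requirement, as a minimal sketch rather than the requested deliverable: each categorical feature gets its own Keras Embedding layer, and index 0 is reserved for values not seen in training. The function name build_model, the layer sizes, and the binary target are assumptions made only for illustration.

```python
import numpy as np
from tensorflow import keras

layers = keras.layers


def build_model(n_features, vocab_size, embed_dim=8):
    """One integer input and one Embedding per categorical feature.

    Assumption: codes 1..vocab_size are categories seen in training,
    and code 0 is reserved for any value not seen in training.
    """
    inputs, embedded = [], []
    for i in range(n_features):
        inp = keras.Input(shape=(1,), dtype="int32", name=f"feature_{i}")
        # input_dim is vocab_size + 1 because of the reserved index 0.
        emb = layers.Embedding(input_dim=vocab_size + 1, output_dim=embed_dim)(inp)
        embedded.append(layers.Flatten()(emb))
        inputs.append(inp)
    x = layers.Concatenate()(embedded)
    x = layers.Dense(64, activation="relu")(x)
    # Assumption: a binary target; swap the head for other tasks.
    output = layers.Dense(1, activation="sigmoid")(x)
    model = keras.Model(inputs=inputs, outputs=output)
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model
```

The model expects a list with one (n_rows, 1) integer array per feature, for example [X[:, i:i + 1] for i in range(n_features)].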
2
Provide code that ignores new categorical values in the data at prediction time.
For example, new values should be one-hot encoded as all zeros.
For example, for values used in the training samples:
abc -> 00001
cfr -> 00010
trvbn -> 00100
etc
Not used in training:
kljghkjlh -> 00000
ygtfrd -> 00000
u7y8uu -> 00000
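The mapping above could be sketched with plain NumPy arrays as follows. The helper names fit_vocab, encode, and one_hot are hypothetical, and the bit order of the resulting rows follows the sorted vocabulary, so it will not match the example codes above position by position. The dense one_hot helper is only for displaying a tiny example, not for the full table.

```python
import numpy as np


def fit_vocab(train_col):
    """Sorted array of the distinct values seen in training."""
    return np.unique(np.asarray(train_col))


def encode(col, vocab):
    """Map values to positions in vocab; unseen values become -1."""
    col = np.asarray(col)
    pos = np.searchsorted(vocab, col)
    pos = np.minimum(pos, len(vocab) - 1)  # keep positions in range
    known = vocab[pos] == col
    return np.where(known, pos, -1)


def one_hot(codes, n_categories):
    """Dense one-hot rows; a code of -1 yields an all-zero row."""
    out = np.zeros((len(codes), n_categories), dtype=np.uint8)
    rows = np.flatnonzero(codes >= 0)
    out[rows, codes[rows]] = 1
    return out


vocab = fit_vocab(["abc", "cfr", "trvbn", "abc"])
codes = encode(["abc", "kljghkjlh", "trvbn", "u7y8uu"], vocab)
rows = one_hot(codes, len(vocab))
```

Here codes is [0, -1, 2, -1], and the two unseen values produce all-zero rows.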
3
Deliver a working example: data and Keras Python code.
The data table should be big: millions of rows and more than 30 features,
and each feature should have at least 300 categories.
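A table of this shape could be generated synthetically, for instance as below. This is only a sketch with random integer category codes and random binary labels (both assumptions); the function name make_synthetic_table is hypothetical. Drawing the prediction set from a wider code range is one simple way to simulate categories that never appeared in training.

```python
import numpy as np


def make_synthetic_table(n_rows, n_features=30, n_categories=300, seed=0):
    """Random integer category codes plus random binary labels."""
    rng = np.random.default_rng(seed)
    # int16 keeps the table small: 1,000,000 x 30 codes is about 60 MB.
    X = rng.integers(0, n_categories, size=(n_rows, n_features), dtype=np.int16)
    y = rng.integers(0, 2, size=n_rows, dtype=np.int8)
    return X, y


# Small sizes here; pass n_rows=1_000_000 or more for the full-size table.
X_train, y_train = make_synthetic_table(n_rows=1000)
# Codes 300..349 never occur in X_train, so they act as unseen categories.
X_new, _ = make_synthetic_table(n_rows=1000, n_categories=350, seed=1)
```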
4
For example, use this idea:
[login to view URL]
onehot_encoder = OneHotEncoder(sparse=False, handle_unknown='ignore')
or
[login to view URL]
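As a small sketch of the handle_unknown='ignore' idea with scikit-learn: the encoder below is left at its default sparse output instead of requesting a dense array, and the toy arrays X_train and X_new are made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Toy data: two categorical columns (illustrative values only).
X_train = np.array([["abc", "x"], ["cfr", "y"], ["trvbn", "x"]], dtype=object)
X_new = np.array([["abc", "y"], ["kljghkjlh", "x"], ["u7y8uu", "zzz"]], dtype=object)

# handle_unknown="ignore": unseen values get no active column at all.
encoder = OneHotEncoder(handle_unknown="ignore", dtype=np.float32)
encoder.fit(X_train)

encoded = encoder.transform(X_new)   # SciPy sparse matrix by default
row_nnz = encoded.getnnz(axis=1)     # number of active columns per row
```

The last row of X_new contains only unseen values, so it has no non-zero entries; the middle row keeps only the column for its known second value.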
BUT
1
Do not use a slow solution (working with dataframes instead of arrays, not optimized for speed),
like the one in
[login to view URL]
2
Do not use a memory-inefficient solution,
meaning: use a sparse data representation, not a dense one,
since I have big data: a big data table that takes a lot of space in memory when one-hot encoded.
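To make the memory concern concrete: 1,000,000 rows x 30 features x 300 categories is 9,000,000,000 one-hot cells, which is 36 GB as dense float32, while at most 30 cells per row are non-zero. A sketch of one way to stay sparse is shown below, assuming integer category codes where -1 marks an unseen value; the helper names codes_to_csr and dense_batches are hypothetical. Only one batch at a time is densified for the network.

```python
import numpy as np
from scipy import sparse


def codes_to_csr(codes, n_categories):
    """Build a sparse one-hot matrix directly from integer codes.

    codes: (n_rows, n_features) ints; -1 marks an unseen category and
    simply contributes no non-zero entry to its row.
    """
    n_rows, n_features = codes.shape
    known = codes >= 0
    # Each feature owns its own block of n_categories columns.
    offsets = np.arange(n_features, dtype=np.int64) * n_categories
    cols = (codes.astype(np.int64) + offsets)[known]  # row-major order
    indptr = np.concatenate(([0], np.cumsum(known.sum(axis=1))))
    data = np.ones(cols.shape[0], dtype=np.float32)
    shape = (n_rows, n_features * n_categories)
    return sparse.csr_matrix((data, cols, indptr), shape=shape)


def dense_batches(matrix, batch_size):
    """Yield dense slices so only one batch is dense at any time."""
    for start in range(0, matrix.shape[0], batch_size):
        yield matrix[start:start + batch_size].toarray()


codes = np.array([[0, 2], [-1, 1], [-1, -1]])
one_hot_sparse = codes_to_csr(codes, n_categories=3)
batches = list(dense_batches(one_hot_sparse, batch_size=2))
```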
Awarded to:
I have all the skills you need; I can develop the model you want. I am proficient in TensorFlow and Keras.