Contains methods for transferring.
import os
We'll create collections of Embedding layers, which will be used to test our transfer methods.
emb_szs1 = ((3, 10), (2, 8))
emb_szs2 = ((2, 10), (2, 8))
embed1 = nn.ModuleList([nn.Embedding(ni, nf) for ni,nf in emb_szs1])
embed2 = nn.ModuleList([nn.Embedding(ni, nf) for ni,nf in emb_szs2])
embed1
Now, we'll create collections containing required metadata.
newcatcols = ("new_cat1", "new_cat2")
oldcatcols = ("old_cat2", "old_cat3")
newcatdict = {"new_cat1" : ["new_class1", "new_class2", "new_class3"], "new_cat2" : ["new_class1", "new_class2"]}
oldcatdict = {"old_cat2" : ["a", "b"], "old_cat3" : ["A", "B"]}
json_file_path = "../data/jsons/metadict.json"
with open(json_file_path, 'r') as j:
metadict = json.loads(j.read())
metadict
is a Dict
with the keys as the classes in dest. model's data, and value is another Dict
where mapped_cat
corresponds to the class in src model's data, along with information about how the classes map from dest. data to src data.
metadict
df = pd.DataFrame({"old_cat1": [1, 2, 3, 4, 5], "old_cat2": ['a', 'b', 'b', 'b', 'a'], "old_cat3": ['A', 'B', 'B', 'B', 'A']})
cats = ("old_cat2", "old_cat3")
embdict = extractembeds(embed2, df, transfercats=cats, allcats=cats, path="tempwtbson")
get_metadict_skeleton(df)
transferembeds_
Embeddings before transfer:
embed1.state_dict()
embed2.state_dict()
transfer_cats = ("new_cat1", "new_cat2")
transferembeds_(embed1, embdict, metadict, transfer_cats, newcatcols=newcatcols, oldcatcols=oldcatcols, newcatdict=newcatdict)
transfer_cats = ("new_cat1", "new_cat2")
transferembeds_(embed1, embed2, metadict, transfer_cats, newcatcols=newcatcols, oldcatcols=oldcatcols, oldcatdict=oldcatdict, newcatdict=newcatdict)
transfer_cats = ("new_cat1", "new_cat2")
transferembeds_(embed1, pathlib.Path("tempwtbson"), metadict, transfer_cats, newcatcols=newcatcols, oldcatcols=oldcatcols, newcatdict=newcatdict)
Embeddings after transfer:
embed1.state_dict()
embed2.state_dict()
os.remove("tempwtbson")