c# - Check if one collection of values contains another -
suppose have 2 collections follows:
collection1: "a1" "a1" "m1" "m2"
collection2: "m2" "m3" "m1" "a1" "a1" "a2"
all values string values. want know if elements in collection1 contained in collection2, have no guarantee on order , set may have multiple entries same value. in case, collection2 contain collection1 because collection2 has 2 a1's, m1 , m2. theres obvious way: sorting both collections , popping off values find matches, wondering if there's faster more efficient way this. again initial collections have no guarantee on order or how many times given value appear
edit: changed set collection clear these aren't sets can contain duplicate values
yes, there faster way, provided you're not space-constrained. (see space/time tradeoff.)
the algorithm:
just insert elements in set2 hashtable (in c# 3.5, that's hashset<string>), , go through elements of set1 , check if they're in hashtable. method faster (Θ(m + n) time complexity), uses o(n) space.
alternatively, say:
bool issuperset = new hashset<string>(set2).issupersetof(set1);
edit 1:
for people concerned possibility of duplicates (and hence misnomer "set"), idea can extended:
just make new dictionary<string, int>
representing count of each word in super-list (add 1 count each time see instance of existing word, , add word count of 1 if it's not in dictionary), , go through sub-list , decrement count each time. if every word exists in dictionary and count never 0 when try decrement it, subset in fact sub-list; otherwise, had many instances of word (or didn't exist @ all), it's not real sub-list.
edit 2:
if strings big , you're concerned space efficiency, , algorithm works (very) high probability works you, try storing hash of each string instead. it's technically not guaranteed work, probability of not working pretty darn low.
Comments
Post a Comment