matching word-frequency with word-ID in Imagesearch.py #20

Open
koreaccm opened this issue Dec 9, 2014 · 0 comments

koreaccm commented Dec 9, 2014

Hello, @jesolem
Thank you very much for the PCV source code.
I recently noticed what may be an indexing problem in Imagesearch.py.
I would like to confirm whether the original code is correct.

If we look at the querying path, the call chain is query() -------> candidates_from_histogram() -------> candidates_from_word(), and the last function looks up images by word ID:

def candidates_from_word(self,imword):
      im_ids = self.con.execute( "select distinct imid from imwords where wordid=%d" % imword).fetchall()

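However, I think the value written to the imwords table is not a word ID but a word frequency. In add_to_index(), the result of self.voc.project(descr) is treated as the word histogram, so each element imwords[ i ] is the count for word i, not the word number: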

def add_to_index(self, imname, descr):
      ...
      imwords = self.voc.project(descr)
      nbr_words = imwords.shape[0]

      # link each word to image
      for i in range(nbr_words):
          word = imwords[ i ]
          # wordid is the word number itself
          self.con.execute("insert into imwords(imid,wordid,vocname) values (?,?,?)",  (imid,word,self.voc.name))

So it seems that the query compares a word ID against stored word frequencies. Isn't that wrong?
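To make the concern concrete, here is a minimal sketch of the mismatch. The histogram values are invented for illustration, and a plain Python list stands in for the NumPy array that self.voc.project(descr) is assumed to return, so the snippet has no dependencies.

```python
# Hypothetical histogram: position = word ID, value = how often that
# word occurs in the image. These numbers are made up for illustration.
histogram = [0, 3, 0, 1, 2]

# What the original loop writes into the wordid column: the values
# themselves, i.e. the frequencies.
stored_by_original = [count for count in histogram]

# What the proposed change writes: the positions of the nonzero entries.
# This is the plain-Python equivalent of histogram.nonzero()[0] on a
# NumPy array.
stored_by_fix = [word_id for word_id, count in enumerate(histogram) if count != 0]

print(stored_by_original)  # [0, 3, 0, 1, 2]
print(stored_by_fix)       # [1, 3, 4]
```

In this example word 4 occurs twice in the image, but the value 4 never reaches the wordid column in the original version, so a lookup for word 4 would miss the image.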
I think add_to_index() should be changed so that word IDs are stored and then compared with word IDs, as shown below.

def add_to_index(self, imname, descr):
      ...
      imwords1 = self.voc.project(descr)
      imwords2 = imwords1.nonzero()[0]
      nbr_words = imwords2.shape[0]

      # link each word to image
      for i in range(nbr_words):
          word = imwords2[ i ]
          # wordid is the word number itself
          self.con.execute("insert into imwords(imid,wordid,vocname) values (?,?,?)",  (imid,word,self.voc.name))

      # store word histogram for image (once per image, outside the loop)
      # use pickle to encode NumPy arrays as strings
      self.con.execute("insert into imhistograms(imid,histogram,vocname) values (?,?,?)", (imid,pickle.dumps(imwords1),self.voc.name))
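The effect on the lookup can be checked with a small in-memory SQLite sketch. This is not the PCV code itself: build_index() and the standalone candidates_from_word() are hypothetical helpers written for this comparison, the histogram is invented, the table has only the three columns used above, and the query uses a ? placeholder instead of % formatting.

```python
import sqlite3

def build_index(histogram, use_fix):
    """Index one hypothetical image (imid=1) into a fresh in-memory database."""
    con = sqlite3.connect(":memory:")
    con.execute("create table imwords(imid, wordid, vocname)")
    if use_fix:
        # proposed version: store the IDs of the words that occur
        words = [i for i, count in enumerate(histogram) if count != 0]
    else:
        # original version: store the histogram values (frequencies)
        words = list(histogram)
    for word in words:
        con.execute("insert into imwords(imid,wordid,vocname) values (?,?,?)",
                    (1, word, "demo"))
    return con

def candidates_from_word(con, imword):
    """Return the IDs of images linked to word ID imword."""
    rows = con.execute(
        "select distinct imid from imwords where wordid=?", (imword,)).fetchall()
    return [row[0] for row in rows]

histogram = [0, 3, 0, 1, 2]  # made-up counts; word 4 occurs twice, word 0 never
original = build_index(histogram, use_fix=False)
fixed = build_index(histogram, use_fix=True)

print(candidates_from_word(original, 4))  # []  : image missed
print(candidates_from_word(original, 0))  # [1] : false match on a zero count
print(candidates_from_word(fixed, 4))     # [1]
print(candidates_from_word(fixed, 0))     # []
```

With the frequencies stored, the image is missed for a word it contains and returned for a word it does not contain. With the nonzero indices stored, both lookups behave as expected.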
@koreaccm koreaccm closed this as completed Dec 9, 2014
@koreaccm koreaccm reopened this Dec 9, 2014