Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More useful to return unknown data when parsing frames, rather than "undefined". #52

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

gyachuk
Copy link

@gyachuk gyachuk commented Jul 4, 2015

More useful to return unknown data when parsing frames, rather than "undefined".

@gyachuk
Copy link
Author

gyachuk commented Jul 4, 2015

Also, have you considered changing this over to using promises, rather than strung-together callbacks?

@aadsm
Copy link
Owner

aadsm commented Jul 4, 2015

Not really sure about this, why is it more useful to try to read the frame data as a string? Seems kind of a stretch, can you talk about your use case?

Regarding promises, I much prefer them to be honest, I've used Q quite extensively in the past years.
I didn't use it because it would add an extra weight to the library that I'm not convinced most people would benefit from. Also, it's pretty easy to promisify the two functions with a thin wrapper. Q even provides that functionality with Q.nfcall: http://documentup.com/kriskowal/q/

@gyachuk
Copy link
Author

gyachuk commented Jul 4, 2015

I went back and forth between getStringAt() and getBytesAt(). It finally came down to being able to read the first few bytes of returned data in the debugger, for my use case. I'm parsing a "PRIV" tag. I'd be happy with getBytesAt(), too. Either one is better than not returning the data at all.

@gyachuk
Copy link
Author

gyachuk commented Jul 4, 2015

I've not used Q, but have been using promises in jQuery for quite a while. They clean up a lot of callback nesting. I hear what you're saying about extra weight.

One question I have for you is about ID3._files. It seems that ID3.loadTags() will always overwrite the tags for some URL, so it isn't for caching. I'm guessing it's so you can access the tags by URL from within the callback. Any reason for not just passing the tags into the callback?

@aadsm
Copy link
Owner

aadsm commented Jul 4, 2015

Instead of doing that how about passing an optional function to loadTags that is called whenever the parser doesn't know how to read a specific frame?

It is for caching, you only need to call loadTags once and then getAllTags at your heart's desire. You should only call load again if for some reason think the server file changed.

Load/Get is a common pattern when dealing with async operations. In theory you could use only a Load function that uses the cached value or goes to the server but then you'd have two problems:

  • You'd need an extra flag or something to force it to load again in case you need it (clearing the cache basically).
  • You're bottlenecked by the async nature of the Load function, even when the value is cached.

@aadsm
Copy link
Owner

aadsm commented Jul 4, 2015

Btw, here's how you could promisify this with jQuery:

function getTags(filename, options) {
  var deferred = $.Deferred();
  var tags = ID3.getAllTags(filename);

  if (tags) {
    deferred.resolve(tags);
  } else {
    var newOptions = Object.create(options);
    newOptions.onError = function(reason) {
      deferred.reject(reason);
    };

    ID3.loadTags(
      filename,
      function onTagsLoad() {
        deferred.resolve(ID3.getAllTags(filename));
      }
      newOptions
    );
  }

  return deferred.promise();
}

This is based on: http://stackoverflow.com/questions/22519784/how-do-i-convert-an-existing-callback-api-to-promises

Disclaimer: I didn't run this code so how knows if it works :)

@gyachuk
Copy link
Author

gyachuk commented Jul 4, 2015

The reason I'm asking about _files is that I've got a stream of video data coming in, with ID3 data every second or so. I've wired up a dataReader for a buffer as follows, with the call to ID3.clearTags so that we don't keep these ID3 objects hanging around. Without that, it could easily become quite a memory hog. This may not be a very common usage.

  id3Updated: function(ID3Data) {
      var data = atob(ID3Data);
      var id3Tags;
      ID3.loadTags(data, function() {
          id3Tags = ID3.getAllTags(data);
          ID3.clearTags(data);
      }, {
              dataReader: function(str, fncCallback, fncError) {
                  fncCallback(new BinaryFile(str));
              },
              // I'm using this to make sure all possible frames are returned.
              // I know this isn't very efficient. It only needs to be calculated once.
              tags: Object.keys(ID3v2.frames).map(function(key) { return key; })
          }
      );

      return id3Tags;
  }

And the reason I was asking about promises was that it seems this would be natural use case, where you wouldn't need to keep anything in backing storage.

  id3Updated: function(ID3Data) {
      var data = atob(ID3Data);
      var id3promise = ID3.loadTags(data, {
              dataReader: function(str, fncCallback, fncError) {
                  // Might be clearer here to just return new BinaryFile(str),
                  // but I'm keeping with the existing idiom.
                  fncCallback(new BinaryFile(str));
              }
      });
      id3promise.then(function(id3Obj) {
          var id3Tags = id3Obj.getAllTags();
          return id3Tags;
      });
      id3promise.error(function() {
          return {};
      });
      return id3promise;
  }

In other words, the promise is returned from the loadTags() call, not from the getAllTags() call. With your code above, you still have to provide a callback to loadTags that calls getAllTags. By that time, the data has already arrived, so I'm not sure that a promise provides any real improvement.

I hope you're not taking this as any kind of criticism. I think this library is great. I'm just asking about possible paths for future development.

@aadsm
Copy link
Owner

aadsm commented Jul 5, 2015

Oh, I see your use case yeah.
However, if the loadTags returned a promise I would still maintain a cache system because I would want to have a way to get the tags of a specific file without having to "force" the consumer to store the promise they got from loadTags.

On the jQuery example, it's true that I still need to provide a callback, but that's the way to promisify a function. But from the consumer point of view it's irrelevant, this is just an abstraction. You don't ever need to deal with how it's implemented. You just need to add a jquery-id3.js file to your project with:

ID3.jLoadTags = function(filename, options) {
  var deferred = $.Deferred();
  var tags = ID3.getAllTags(filename);

  if (tags) {
    deferred.resolve(tags);
  } else {
    var newOptions = Object.create(options);
    newOptions.onError = function(reason) {
      deferred.reject(reason);
    };

    ID3.loadTags(
      filename,
      function onTagsLoad() {
        var tags = ID3.getAllTags(filename);
        ID3.clearTags(data);
        deferred.resolve(tags);
      }
      newOptions
    );
  }

  return deferred.promise();
}

and then use it like this in your logic:

id3Updated: function(ID3Data) {
  var deferred = $.Deferred();
  var data = atob(ID3Data);

  ID3.jLoadTags(data, {
    dataReader: function(str, fncCallback, fncError) {
      // Might be clearer here to just return new BinaryFile(str),
      // but I'm keeping with the existing idiom.
      fncCallback(new BinaryFile(str));
    }
  }).then(deferred.resolve, function() { deferred.resolve({}) });

  return deferred.promise();
}

I'm not really sure what you mean with "By that time, the data has already arrived, so I'm not sure that a promise provides any real improvement." though.

Oh no, I don't take it as criticism, I take it as valuable feedback :) I don't get feedback on the library as often as I would like so this is really helpful! There's a bunch of things I don't like about the current API (and internal architecture to be honest), this was based on a demo to show off loading raw bytes from an XHR, I should have thought about it better at the time but it is what it is now. I've been working on a new version on my "free" time, with unit tests and stuff. I'm glad it's helpful to you!

How about my suggestion to add an optional function to read unknown frames?

Btw, on your example, the tags can be just Object.keys(ID3v2.frames) which already returns an array with all keys.

@gyachuk
Copy link
Author

gyachuk commented Jul 5, 2015

Thanks for the note about Object.keys(). That's what I get for taking code from stackoverflow and blindly adapting it. :)

As for a function to read unknown frames, it might not be a bad idea to be able to override it. I'm not sure why you wouldn't want to make that the default, though. It might be a philosophical difference. I think I'd go for returning as much as possible, and returning less when explicitly asked.

In other words, I'd set it up so that by default, you'd return all tags, not just the ones from the default shortcuts (I'm talking about default behavior here. I know this can be overridden). I'd also return all available data for each tag, unless explicitly requested otherwise. That could be the inverse function of what I think you're talking about, which is one that skips over the data and returns "undefined", like your current implementation does.

And yes, your jLoadTags function would do exactly what I'd like. I guess it's another philosophical difference, but I still don't care much for keeping user data around in the ID3._files object. To me, it is pretty much keeping global data hidden behind the caller's back. BTW, the caller wouldn't be keeping the promise around, but instead the tags themselves. I guess I'd set it up so that the caller gets the tags and if they want to keep them, they put them somewhere. If they don't want them kept, they do nothing. The way the code is currently set up, it is the opposite. If the caller wants to keep the data around, they do nothing. If they just want to process it once and forget it, they have to explicitly clear the tags.

Finally, sorry for the confusion about my "By that time, the data has already arrived..." comment. It is directed to the situation where the getAllTags was returning a promise, rather than consuming one. It doesn't apply when the promise is being returned by loadData().

And thanks again for the great code.

@aadsm
Copy link
Owner

aadsm commented Jul 13, 2015

As for a function to read unknown frames, it might not be a bad idea to be
able to override it. I'm not sure why you wouldn't want to make that the
default, though. It might be a philosophical difference. I think I'd go for
returning as much as possible, and returning less when explicitly asked.

I understand the idea but in this case you're proposing to parse the data as a string even when it's not a string. This is the bit I don't agree with. I wouldn't mind passing the raw data though.

In other words, I'd set it up so that by default, you'd return all tags, not
just the ones from the default shortcuts (I'm talking about default behavior
here. I know this can be overridden). I'd also return all available data for
each tag, unless explicitly requested otherwise. That could be the inverse
function of what I think you're talking about, which is one that skips over
the data and returns "undefined", like your current implementation does.

Yeah, I agree with this, this should have been the default behaviour from the beginning.

And yes, your jLoadTags function would do exactly what I'd like. I guess
it's another philosophical difference, but I still don't care much for
keeping user data around in the ID3._files object. To me, it is pretty much
keeping global data hidden behind the caller's back. BTW, the caller
wouldn't be keeping the promise around, but instead the tags themselves. I
guess I'd set it up so that the caller gets the tags and if they want to
keep them, they put them somewhere. If they don't want them kept, they do
nothing. The way the code is currently set up, it is the opposite. If the
caller wants to keep the data around, they do nothing. If they just want to
process it once and forget it, they have to explicitly clear the tags.

As a consumer you shouldn't care how the library manages its own internal data. Implementing cache systems on I/O (or other type) bound functions is a really common pattern, there's even a design pattern addressing this use case, the Memoization.

I agree that the current implementation has a big issue, it doesn't have a limit on the cache size, or a way to configure this limit.
If the library doesn't implement their own cache system then the consumer will have to. I prefer to have this kind of boilerplate code in the library instead of asking all consumers to implement the same thing over and over. This is the rationale behind this design decision.

I really appreciate your comments!

@gyachuk
Copy link
Author

gyachuk commented Jul 13, 2015

Glad you're receptive to the comments.

I'm happy as a client to call clearTags() as soon as I'm done with the tags.

Sounds like the simple change to my PR is to return frameData.getBytesAt() rather than frameData.getStringAt().

I'd be OK with that. I think I said that way up on one of the first comments. :)

Problem though with that is that I'd really like to use the BinaryReader to get at shorts, ints and longs. But that doesn't work completely well, since strData comes in with typeof(strData) === "object", at least on Firefox. So getByte() never gets defined. And it is pretty widely used by BinaryReader. Any thoughts on how to address that?

Also, was wondering why you chose the Opera license, instead of the MIT license? And did you know that the Opera links in your LICENSE.md get a 404 these days. I can't even find the terms of the Opera license.

@aadsm
Copy link
Owner

aadsm commented Jul 20, 2015

Cool, I missed that comment though! To solve the getBytesAt() problem, how about returning a new BinaryFile that only handles the data returned by getBytesAt()? Add a .getBinaryFileWithRange function and use it.

There is no Opera license, it's BSD. The copyright of a big bulk of the code is owned by Opera because I was working there at the time and they were using the BSD license. Yeah, they completely changed dev.opera.com, they've dumped it on github https://github.com/operasoftware/devopera, but you can check it using the wayback machine: https://web.archive.org/web/20100426105813/http://dev.opera.com/licenses/bsd/.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants