Team Formation as Bundle Recommendation #227
Comments
(Copied from an email sent from [email protected] to [email protected] on 12/28/2023 at 11:56 am) I've had a chance to go through some papers on DBLP to find implementations of bundle recommendation algorithms that were published in top-tier venues and also had source code and datasets available. I found five, three of which I have been able to clone and run. I was able to clone the other two, but their code was written to use CUDA on Nvidia cards and I do not yet have a GPU, so I have not run them. I was able to get in touch with Zhang Zhenning about SUGER. Zhang mentioned that there has been no additional work on the repo since publication, so the code on GitHub is still the latest available. It ran with minimal modifications: I needed to update the hard-coded paths to the right locations on my machine, and it seemed OK. I have attached a spreadsheet with the details of the papers and repos I reviewed. I think you had mentioned that the first steps would be to find papers, review them, clone and run the repos, then check in with you before going further?
I've been having a lot of trouble getting some of these to run. I've tried adding more RAM, CPUs, nodes, and GPUs. Eventually I got some of them running, but only two datasets have completed on one program: BGCN with the imdb and github data. BundleGT is reporting an issue with the Top20 results, and CrossCBR and MIDGN are giving me index errors. I'm not 100% sure where to focus: on getting one or two methods to work on all of the datasets, on getting an entire dataset to run versus a slice of it on all of the algorithms, or on how deep to dig into the code/data issues.
Hi Richard, @rkenny 1- Foremost, we need to create a repo that includes these methods in a clean, readable codebase/pipeline. You can create it in your github and I can transfer it to fani-lab, or I can create an empty repo here and you push your code there. 2- We need to save the predictions of these methods and then evaluate them based on our own metrics. We can meet during this week or on Friday and I can explain more. 3- For dblp, the problem is its large number of experts and skills. We can filter them more. I will also explain this in our meeting.
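Point 2 above (save each method's predictions, then score them with shared metrics) could be sketched roughly as follows. This is only an illustration: the JSON-lines file format, the function names, and the choice of recall@k are assumptions for the sketch, not the lab's actual evaluation pipeline.

```python
import json

def save_topk_predictions(user_topk, path):
    # Persist each user's ranked bundle IDs (one JSON object per line,
    # a hypothetical format) so any model's output can be scored later
    # by one shared metric script instead of each repo's own evaluator.
    with open(path, "w") as f:
        for user, bundles in user_topk.items():
            f.write(json.dumps({"user": user, "topk": bundles}) + "\n")

def recall_at_k(user_topk, ground_truth, k):
    # Fraction of each user's held-out bundles that appear in that
    # user's top-k list, averaged over users with held-out bundles.
    scores = []
    for user, relevant in ground_truth.items():
        if not relevant:
            continue
        hits = len(set(user_topk.get(user, [])[:k]) & set(relevant))
        scores.append(hits / len(relevant))
    return sum(scores) / len(scores) if scores else 0.0
```

Writing predictions to a neutral on-disk format like this decouples the five repos from the metric code, so the same scorer runs over BGCN, BundleGT, CrossCBR, MIDGN, and SUGER output alike.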
I have created a repo at https://github.com/rkenny/4960A
I have been having all kinds of trouble with dblp. I've been trying to resolve an issue with the bundle_item dataset (which is papers-authors, if my notes are accurate). When I try to load the entire dataset, or a subset of it, into the models, I get an error from scipy saying the row index exceeds the matrix dimensions. I'm working on narrowing down the cause... I suspect my mapper has a bug or two. The dataset is huge, so I'm mostly working on extracting samples from it more quickly to cut debugging time in the actual program. I'll be working on this over the weekend and early Monday morning. Does it make sense for me to spend more of my time at this point improving the performance of the mapper/ETL tools for DBLP? Just want to make sure I'm going in a direction that makes sense.
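For what it's worth, a common cause of scipy's "row index exceeds matrix dimensions" error is building the matrix from raw IDs (which in DBLP are large and non-contiguous) instead of remapping them to 0..n-1 indices first. A minimal sketch of that remapping step, assuming a list of (bundle, item) ID pairs; `build_interaction_matrix` and the variable names are illustrative, not taken from any of the repos:

```python
import numpy as np
from scipy.sparse import csr_matrix

def build_interaction_matrix(pairs):
    # pairs: list of (raw_bundle_id, raw_item_id) tuples.
    # Remap raw IDs to contiguous 0..n-1 indices so row/col values
    # can never exceed the matrix shape derived from the same maps.
    bundle_ids = sorted({b for b, _ in pairs})
    item_ids = sorted({i for _, i in pairs})
    b_map = {b: idx for idx, b in enumerate(bundle_ids)}
    i_map = {i: idx for idx, i in enumerate(item_ids)}
    rows = np.array([b_map[b] for b, _ in pairs])
    cols = np.array([i_map[i] for _, i in pairs])
    data = np.ones(len(rows), dtype=np.float32)
    mat = csr_matrix((data, (rows, cols)),
                     shape=(len(b_map), len(i_map)))
    return mat, b_map, i_map
```

Keeping the returned `b_map`/`i_map` around also lets you translate model predictions back to the original DBLP IDs when saving results.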
@rkenny |
I have been able to get IMDB to run on BundleGT, BGCN, and CrossCBR. I'm still working on MIDGN.
@rkenny |
I thought I was closer than I actually am... the data really doesn't make a lot of sense. Do you have time early in the week of the 17th-24th to discuss it? I'm concerned that I might be completely lost without realizing it.