added_multiprocessing #26

Merged

Conversation

abhishekgupta368
Contributor

#6 adding multiprocessing

Label: medium

Feature added:
I added multiprocessing in a file, using pools.

Owner

@rajatkb rajatkb left a comment

  • Remove the header data added by the Spyder editor. We don't need it in this project.
  • Change the implementation to use the Process class, not thread pools. Thread pools are not interruptible, i.e. they cannot be scheduled when in I/O wait.
  • Do not change the code structure. You can create a separate utility class focused on Process-based execution, then use that utility to run our function here, passing its args as kwargs. This will allow us to isolate multiprocessing issues in one single file.
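
For readers following the review, the kind of utility requested in the last point could look roughly like the sketch below. It is not the code that was merged: the class name `ProcessRunner`, its methods, and the stand-in task `scrape_site` are all hypothetical, chosen only to illustrate a Process-based helper that takes a function plus keyword arguments.

```python
import multiprocessing as mp


def scrape_site(name, pages):
    """Stand-in task (hypothetical); module-level so child processes can import it."""
    print(f"{name}: scraping {pages} page(s)")


class ProcessRunner:
    """Hypothetical utility that runs callables in separate processes."""

    def __init__(self):
        self._processes = []

    def submit(self, target, **kwargs):
        """Start target(**kwargs) in a new process and keep track of it."""
        process = mp.Process(target=target, kwargs=kwargs)
        process.start()
        self._processes.append(process)
        return process

    def join_all(self):
        """Wait for all tracked processes; return their exit codes in order."""
        for process in self._processes:
            process.join()
        exit_codes = [process.exitcode for process in self._processes]
        self._processes.clear()
        return exit_codes


if __name__ == "__main__":
    runner = ProcessRunner()
    runner.submit(scrape_site, name="site-a", pages=3)
    runner.submit(scrape_site, name="site-b", pages=5)
    print(runner.join_all())
```

Keeping the `multiprocessing` import and the `Process` bookkeeping inside one class means callers only hand over a function and its keyword arguments, which matches the goal of isolating multiprocessing problems in a single file.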

@abhishekgupta368 abhishekgupta368 force-pushed the add_multithreading@abhishekgupta368 branch from 3c6c9fa to 6829bb0 on March 6, 2020 10:17
Owner

@rajatkb rajatkb left a comment

  • You are overcomplicating the implementation.
  • I will add an interface to be implemented. Follow it; it will help you implement the functionality.
  • Make a habit of adding docstrings to your code. I did not see docstrings for the new functions and classes.

@rajatkb
Owner

rajatkb commented Mar 7, 2020

@abhishekgupta368 I will provide an interface; make sure to implement it.

@rajatkb rajatkb linked an issue Mar 8, 2020 that may be closed by this pull request
@rajatkb
Owner

rajatkb commented Mar 8, 2020

@abhishekgupta368 please discuss the implementation during the meetup. It's not what we require.

@abhishekgupta368
Contributor Author

No, there is a problem with my git terminal, which is why you got the problematic file.
I am working on it.

@abhishekgupta368
Contributor Author

This is the error I got:

Traceback (most recent call last):
  File "C:/Users/Lenovo/Desktop/open_src_notify/Conference-Notify/Scrapper-Service/app.py", line 95, in <module>
    lambda : scrapper(  log_level = log_level,
  File "C:\Users\Lenovo\Desktop\open_src_notify\Conference-Notify\Scrapper-Service\process\mutiprocessing.py", line 13, in execute_process
    p.start()
  File "C:\Users\Lenovo\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\Lenovo\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\Lenovo\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Lenovo\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\Lenovo\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x000001C85F8E5BF8>: attribute lookup <lambda> on __main__ failed
2020-03-09 11:56:48,340 - __main__ - INFO - Scrapping done from all Scrapper plugins

@rajatkb
Owner

rajatkb commented Mar 9, 2020

Yeah, that's a legit error: we are passing lambda functions, which are not picklable, i.e. they cannot be pickled (converted to a binary representation of the object). Hence the issue. Minor modifications should help. See if there is some hacky way of fixing this. 😅 I am looking into it too.
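
One common way around this kind of `PicklingError` is to replace the lambda with `functools.partial` over a module-level function, since pickle stores functions by their importable name. The sketch below only illustrates that idea and is not the fix that was actually merged; `run_scrapper` is a hypothetical stand-in for the real scrapper call, and its parameters are assumptions.

```python
import pickle
from functools import partial


def run_scrapper(log_level, name):
    """Hypothetical module-level stand-in for the scrapper call."""
    return f"{name} started at log level {log_level}"


# A lambda has no importable name, so pickle cannot serialize it by reference.
task_as_lambda = lambda: run_scrapper(log_level="INFO", name="demo")
try:
    pickle.dumps(task_as_lambda)
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False

# functools.partial over a module-level function can be pickled and restored.
task_as_partial = partial(run_scrapper, log_level="INFO", name="demo")
restored = pickle.loads(pickle.dumps(task_as_partial))

print(lambda_pickles)
print(restored())
```

This matters mostly on Windows, as in the traceback above, where `multiprocessing` spawns a fresh interpreter and has to pickle the process object, including its target, to send it to the child.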

@abhishekgupta368
Contributor Author

ok

Owner

@rajatkb rajatkb left a comment

  • The logger is not used for task-related instantiation. The logger information is provided for you to use, so use it accordingly.
  • There is a merge conflict; resolve it ASAP.

Owner

@rajatkb rajatkb left a comment

  • Can you add some log messages specific to this class that tell us the current stage of the running processes?
  • Log the stats.
  • Add docstrings to the classes and functions. Use the VS Code docstring extension to generate the format.
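
As a rough illustration of what these three points ask for, the sketch below logs a summary of finished processes from a documented function. The logger name, the function `log_run_stats`, and the fields in the stats dict are all hypothetical; the project's actual logger setup and the stats it reports may differ.

```python
import logging
import time

logger = logging.getLogger("process_runner")  # hypothetical logger name


def log_run_stats(exit_codes, started_at):
    """Log a summary of finished processes and return it.

    Args:
        exit_codes: Exit codes collected from the finished processes.
        started_at: Value of time.monotonic() taken before they were started.

    Returns:
        A dict with the keys total, succeeded, failed and elapsed_seconds.
    """
    total = len(exit_codes)
    succeeded = sum(1 for code in exit_codes if code == 0)
    stats = {
        "total": total,
        "succeeded": succeeded,
        "failed": total - succeeded,
        "elapsed_seconds": time.monotonic() - started_at,
    }
    logger.info(
        "Processes finished: total=%d succeeded=%d failed=%d elapsed=%.2fs",
        stats["total"], stats["succeeded"], stats["failed"], stats["elapsed_seconds"],
    )
    return stats


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    start = time.monotonic()
    logger.info("Starting processes")
    log_run_stats([0, 0, 1], start)
```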

Owner

@rajatkb rajatkb left a comment

OK, looks good. Merging.

@rajatkb rajatkb merged commit c5759f5 into rajatkb:master Mar 13, 2020
@rajatkb rajatkb added gssoc20 GSSOC label for gscco20 tag medium GSSOC label for beginner tag labels Mar 13, 2020
@abhishekgupta368
Contributor Author

hard work paid off

Labels
gssoc20 GSSOC label for gscco20 tag medium GSSOC label for beginner tag
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improve ] introduce multiprocessing in main.py for Scrapper-Service
2 participants