Question and Clarification about Queue Manager



  • Hi gang! I have a few questions about queue manager:

    1: Is this the hub for making a true render farm?
    2: How do I know if the render servers are actually rendering?

    My main system is a 64-bit Windows 7 machine with an i5, 8 GB RAM, and a Quadro 4800 graphics card. My render servers are two Dell PowerEdge 8-core Xeon machines (16 cores total) with 32 GB RAM each (64 GB total).

    I'm used to the VUE render farm, which shows you exactly what's going on with each system. Any help you could give would be great.

    Thanks!

    Phil


  • Poser Ambassadors

    [1] Yes, Queue Manager is indeed the means by which you would network render in Poser Pro. I have four workstations and fifteen servers networked together and run them all together for animations; you can install Queue on as many machines as you can network together :D. It will distribute animation frames among the render slaves, or distribute entire renders from a batched list. Note that it will not network a single render by splitting it up amongst several render slaves (I have requested this for P12Pro).
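To picture what "distribute animation frames among the render slaves" means, here is a minimal sketch. It is not Queue Manager's actual scheduler (which most likely hands the next frame to whichever machine is free); the function name `distribute_frames` and the round-robin rule are assumptions made purely for illustration. The point it shows is that whole frames are handed out and a single frame is never split across machines.

```python
def distribute_frames(frames, slaves):
    """Assign whole frames to render slaves in round-robin order.

    Hypothetical illustration only: a frame always goes to exactly
    one slave, mirroring the "no splitting of a single render" rule.
    """
    slaves = list(slaves)
    if not slaves:
        raise ValueError("at least one render slave is required")
    assignments = {name: [] for name in slaves}
    for index, frame in enumerate(frames):
        # Pick the next slave in rotation for this frame.
        assignments[slaves[index % len(slaves)]].append(frame)
    return assignments
```

For example, `distribute_frames(range(1, 7), ["Athena", "Titania"])` gives Athena frames 1, 3, 5 and Titania frames 2, 4, 6.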

    [2] On your render slaves, install & run DLM (DownLoad Manager). Run DLM, inserting the Queue Manager serial (not the Poser serial). DLM should download Queue. Install Queue. Launch Queue; it will challenge you for name and serial; enter those (Queue serial, not Poser serial).
    Check your firewall on each render slave, and make sure Queue Manager has firewall permissions, both in and out, so that it can talk to your workstation's Queue Manager. The command port is 4418; the discovery port is 11523.
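If you want a quick way to test whether a port is reachable from another machine, a small Python sketch like the one below can help. It assumes the command port (4418) accepts plain TCP connections, which this thread does not confirm; the function name `tcp_port_open` is made up for the example. It says nothing about the discovery port, whose protocol is not stated here.

```python
import socket

QUEUE_COMMAND_PORT = 4418  # command port quoted in this thread


def tcp_port_open(host, port=QUEUE_COMMAND_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Assumption: the service listens on TCP. A False result means the
    connection was refused, filtered by a firewall, or timed out.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run it from the workstation against each remote's address (for example `tcp_port_open("192.168.1.119")`), and from a remote against the workstation. A successful ping with a `False` result here would point toward a firewall rule rather than cabling.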

    When you send an animation (or a batch of several still frame renders) to Queue on your workstation, the workstation's Queue should auto-launch.
    The master Queue (on your workstation) should show availability pings from all online servers in the bottom window, indicating "I am here if you need me" for each slave.
    Each slave's Queue should repeatedly show discovery query from 192.168.1.100 or similar; this shows that the slave's Queue is getting an "Are you there?" message from the workstation.

    Once an animation is sent to Queue, there will be a lapse of time while Poser gets things ready; then you'll see a flurry of notices in the master's bottom window along the lines of "192.168.1.119 available... file____.obj sent... file___.jpg sent..."

    Go to each slave (switch among them using a KVM switchbox) and one at a time, the slaves will begin to show activity messages in their Queue's bottom window about receiving those OBJs and JPGs. They will show a "job" in the upper window telling you which frame they are handling, and the percentage of progress accomplished.

    This is the Queue UI from Athena, a render slave:
    [Image: Queue Manager UI - slave.PNG]
    This tells me that Athena is 89.9% done with her assignment, which is frame 119.

    This is the Queue UI from my workstation Urania:
    [Image: Queue Manager UI - master.PNG]
    It shows that the frame 7 job has been completed, and that the frame 119 job is in progress. Note that progress shows as 0% until Athena returns the finished EXR to Urania.



  • On my master system I see exactly that, with the exception of "remote frame completed"... I get no reference to the remotes. Also, if I turn off the option for the host computer to render, then nothing happens. I have to have the host switched "on" for the rendering to start. I have QM installed on the slaves, but I don't think they are being activated. When I look at the information on the slaves it says "queue procs available: 1".

    Is there a block somewhere? Not sure what's going on.


  • Poser Ambassadors

    There does seem to be a block, and it is usually the firewall. See if you can open the security software's firewall section and look at Queue Manager's permissions. It must have in/out permissions on ports 4418 and 11523.

    "queue procs available" will show on the slave units to indicate that they are ready for a job.

    It is not a problem to set the workstation's Queue to not process jobs locally; that leaves your workstation free to do other Poser work. Mine is set that way right now.

    You won't see "remote frame completed" (in the master's Queue UI) until it receives a finished job. Mine shows that because the remote Titania finished her assignment and returned the finished EXR to the workstation Urania.

    Look at the Queue UI on your remotes; do they show "discovery query from 192.168.1.100" ? (100 is the internal network address of my workstation Urania)
    If you see such a discovery query notice on each remote, that means that the remotes "see" the workstation. If you don't see discovery queries on each remote, check those remotes' firewall permissions. Queue Manager needs both in and out permissions, at least on ports 4418 and 11523.

    Look at the Queue UI on your workstation; does it show something like
    Available message from 192.168.1.119 (--.119 is Athena)
    for each of your remotes? If you see this, then your workstation's Queue "sees" the remote. If you don't see this, check firewall permissions.
    [Image: Queue remote available 2.PNG]

    All of your machines should have document sharing enabled, and be joined in a common homegroup.
    When you open My Computer, do you see the remotes in the network?
    [Image: Explorer showing computers on the local network.PNG]



  • I get NOTHING that says "sending file"... lol. I actually turned the firewalls off because I thought that perhaps there was an issue there. Now, my slaves are running Windows Server 2012 and I'm running Windows 7 on my host. Could that be an issue? When I render in VUE they (the slaves) are seen, but not in Poser Queue.

    When I check the network the slaves are there - I see them. I am not home right now so I'll have to check the UI later tonight. I do know that I see "Command port is 4418; discovery port is 11523" on both the host and the slaves.

    Do you think it could still be a firewall issue?


  • Poser Ambassadors

    Turn the firewall back on!

    It will only be a matter of permissions for Queue, if it's the firewall. I have not tried to run Queue (or Poser) on Server 2012; my remotes all have OEM Win7Pro licenses, and the workstations are either Win7Pro or Win7Ultimate. @shvrdavid might know if Server2012 is an issue.

    Queue on your machines has the ports right, but that doesn't mean that the firewall is allowing Queue to use those ports, so checking the firewall for permissions for Queue is still necessary.

    Do your remotes show "discovery query from 192.168.1.xxx"?

    Does your workstation's Queue show "available message from 192.168.1.xxx"?



  • LOL...I WILL - I WILL! :-) I've never seen "available message from 192.168.1.xxx". All I've seen on the remotes is the "Procs Available: 1". I wonder if I need to switch the OS to Windows 7 so that it can properly join the homegroup of my host - then again - I should probably just learn Server 2012..lol

    So there is obviously a communication issue somewhere...BTW...I don't know if this matters, but I'm running the newer CAT 6 cables.


  • Poser Ambassadors

    Remotes will show "procs available: 1" even if disconnected from the network.

    If you haven't seen "available message from 192.168.1.xxx", then either your remotes aren't getting out, or your workstation is blocking incoming messages (or both). Check firewall permissions.

    Vice-versa if the remotes aren't showing "discovery query from 192.168.1.xxx" (your workstation's address). Check firewall permissions.
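The two checks above boil down to a small decision table. Here it is as a Python sketch; the function name `diagnose` and the wording of the returned hints are my own, and the logic only restates the reasoning in this post.

```python
def diagnose(master_sees_available, remote_sees_discovery):
    """Map the two observations to a likely firewall direction.

    master_sees_available: the workstation's Queue shows
        "available message from <remote address>".
    remote_sees_discovery: the remote's Queue shows
        "discovery query from <workstation address>".
    """
    if master_sees_available and remote_sees_discovery:
        return "Both directions work; look elsewhere."
    if remote_sees_discovery:
        # Query arrives, reply is lost on the way back.
        return ("Replies are lost: check outbound rules on the remote "
                "and inbound rules on the workstation.")
    if master_sees_available:
        # Remote announces itself but never hears the workstation.
        return ("Queries are lost: check outbound rules on the workstation "
                "and inbound rules on the remote.")
    return "No traffic either way: check firewall permissions on both machines."
```

So a remote that shows discovery queries while the workstation never shows an "available" message lands in the second case.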

    If you're working in Poser and not currently using Queue, the master will still show "available" messages from all remotes, and each remote will likewise pop up another discovery query notice every few seconds.

    You can see these if you open the Queue log:
    C:\Users\admin\AppData\Local\Temp\Queue Manager\11\log.txt
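Since the window scrolls quickly, it can be easier to count these messages in the log file. The sketch below assumes the log lines contain the phrases "discovery query from <address>" and "available message from <address>" as quoted in this thread; the exact log format is not shown here, so treat the patterns and the name `summarize_queue_log` as assumptions.

```python
import re
from collections import Counter

# Assumed phrases, taken from the messages quoted in this thread.
_PATTERNS = {
    "discovery_query": re.compile(
        r"discovery query from (\d+\.\d+\.\d+\.\d+)", re.IGNORECASE),
    "available": re.compile(
        r"available message from (\d+\.\d+\.\d+\.\d+)", re.IGNORECASE),
}


def summarize_queue_log(lines):
    """Count discovery and availability messages per sender address."""
    summary = {name: Counter() for name in _PATTERNS}
    for line in lines:
        for name, pattern in _PATTERNS.items():
            match = pattern.search(line)
            if match:
                summary[name][match.group(1)] += 1
    return summary
```

To use it on a real log, open the file and pass the file object, for example `summarize_queue_log(open(path, errors="replace"))`, where `path` is the log location on that machine. An empty `available` counter on the workstation, or an empty `discovery_query` counter on a remote, is the symptom to look for.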

    All of my stuff is hardwired with Cat6 through a Gigabit switch; you're well set there.



  • Ok - well - when I get home in a few hours I'll check the firewall. I keep the Queue logs open to see if there are any changes. I'll take screen shots as well to show you what's going on. BTW - I want to thank you for your help.

    I DO see this on my host: "C:\Users\admin\AppData\Local\Temp\Queue Manager\11\log.txt"

    I have the queue running now and I remember the host having a message saying "Procs Available: 0"


  • Poser Ambassadors

    The host (master, workstation) says zero procs available simply because you have it set to not process jobs locally; that's correct, no problem there. It means the workstation is not available for network render jobs.

    Periodically refresh your Windows Explorer view of the Queue log, because it is constantly being added to as time passes.

    And look at the Queue log on the remotes, also. Look for any discovery query from 192.168.1.xxx and if you don't see that, then you have a communication blockage. Check firewall permissions.



  • Hmmmm...interesting...I do have the host set to process locally....Hmmmm. Sounds like I need to check firewalls and refresh the systems.

    I want to say that I have seen "discovery query" on the remotes for a split second. Like I said, I'll take screen shots to show you if that's okay.


  • Poser Ambassadors

    Aha. If you have the master set to process jobs locally, then it should show 1 process available. Poser, FFRender64, and Queue are separate executables; they must have firewall permissions to speak to each other. It may be that Poser can't even send a job to Queue on your machine due to firewall blockage.

    Discovery query showing on the remotes is a good sign; yes, it's fleeting, as the window keeps scrolling. You can peruse the Queue log text file at your leisure, though.

    Screenshots are good; if you don't have sufficient forum privileges yet, then just copy/paste from the Queue log text file.



  • Ok - you've given me a lot to think about and to investigate. I'll be home in about 3 hours so I can look into all of that. But if the firewalls are turned off wouldn't that then allow the permissions?


  • Poser Ambassadors

    Yes, that should remove any communication blocks (if done on both master and slave unit).

    You may have a permission block completely within the master, if Poser can't talk to Queue. But that is also a firewall issue.



  • Keeping my fingers crossed. Thanks for everything. I'll let you know in a few hours.

    Thanks



  • Ok - So I turned my firewalls back on and made sure that the Render Queue was allowed access. I have yet to restart my systems, but below is an image from the HOST computer:

    [Image: render.jpg]

    The SLAVE systems are still saying "Discovery query from 192.168.x.x (port4418)" and "1 Procs available".

    I'm at a loss.....



  • Wild guess here, but a problem I ran into at the college was that a couple of the computers were in a different workgroup. The addresses were all correct, all being on the same subnet, but the workgroup name was different. It has been a while since I had that issue and I do not remember all the details, but I believe the remotes were seeing the message asking if they were available but could not send a response back. Since you are running two different OSes, it may be a matter of the workgroup. Can you ping the remotes from your main system? Although, thinking about it some more, I believe they pinged fine and just wouldn't talk.



  • @richard60 Hi Richard, I actually had that same thought and wondered if it would just be easier to switch the remotes to the same OS as the Host system. Everything shows on the network "tree", but I don't see where to add to a homegroup on the slaves. I'll attempt to ping them. Thank you!



  • @richard60
    Ok...I pinged one of the remotes. It said that 4 packets were sent and 4 were received...so it's reading it. But I'm wondering still if it's a homegroup linkage issue. Could it also be getting hung up at the router? I use VUE and I can send images to the slaves all the time - so they are reading the system. Only reason I'm NOT using Vue at this time is because it crashes when I attempt to render Poser with BVH files....so I'm rendering in Firefly.



  • Ok so after a bit of research, Windows Server 2012 can't connect to a homegroup - it's a Windows 7 and newer feature. But is a homegroup the same as a workgroup? Because on the remote side, I can connect them to a workgroup - which I thought I had already done. Grrrr.