Coding for Multiple Threads technique?

Started by Sengin, June 01, 2008, 01:07:08 AM

Previous topic - Next topic

Sengin

So I've got a question about a possible technique or idea using threads, but don't know how what it's called or how to implement it.  Say you have a small function that you can run in parallel.  So you create a thread for it.  It finishes its task, but instead of wanting it to destruct, you want to suspend it for an indefinite amount of time.  Then when you need its help again (say to spilt up the time initializing a huge array or processing the elements in a huge array), you wake it up, it jumps to the beginning of a function which does something (say in this case helps initialize an array), and when it finishes it gets suspended again.  So it's like this worker you can call to lend an extra set of hands whenever you need it to.  This would improve efficiency because each time you need a little help you aren't constructing and destrucing a thread and all the initialization of each one, for just a small service.  With all the setup, it wouldn't really improve much at all.  Does anyone know what this is called or how to implement it (using the Boost library would be the best solution, but I'll take one using standard Windows functions if you got it)?  The only way I've come up with is to put everything you'd want it to do in the same function, and just have the thread suspend itself after each little bit of code, and have the main thread resume it when necessary, but I know it's bad design and relies on a global boolean value...  If you don't know what I mean, maybe the code below will help explain (but just think of it not all in the same function).  I've been told it can be done (but not how) and that it might be called a worker thread, but search results have turned up nothing.  Thanks for any help, even if it's only the name of the idea of doing something this way.

#include <windows.h>

DWORD WINAPI doThreadProcessing(void* param);
bool stop = false;
HANDLE thread;
const int arrLength = 100000;
int arr1[arrLength];

int main(int argc, char** argv)
{
int i = 0;
unsigned long int threadID;

thread = CreateThread(NULL, 0, doThreadProcessing, (void*) i, 0, &threadID);
//I know the thread has to have THREAD_SUSPEND_RESUME in a security attribute

int x;
for(x = 0; x < arrLength; x+=2)
{
arr1[x] = 0;
}

//do some stuff......

while(!stop)
{
Sleep(10);
}
stop = false;
ResumeThread(thread);

for(x = 0; x < arrLength; x+=2)
{
if(arr1[x] >= 5)
arr1[x] = 5;
}

CloseHandle(thread);
return 0;
}

DWORD WINAPI doThreadProcessing(void* param)
{
int y;
for(y = 1; y < arrLength; y+=2)
{
arr1[y] = 0;
}

stop = true;
SuspendThread(thread);

for(y = 0; y < arrLength; y+=2)
{
if(arr1[y] >= 5)
arr1[y] = 5;
}

SuspendThread(thread);

//...etc... for small tasks that wouldn't really benefit from
//contructing and destructing a new thread each time

return 0;
}

Matt

#1
That's right, this would be called a "worker thread". Worker threads are sometimes organised by a thread pool. There appears to be a good set of links to articles in the Wikipedia entry for the "thread pool" pattern.

http://en.wikipedia.org/wiki/Thread_pool_pattern

Read as much as you can on the subject, even if it's related to another language such as Java.

Depending on your goals, you might not need a thread pool as such.

It could work like this. Each worker thread runs a loop. Inside the loop it waits while there are no tasks in the queue, and once a task becomes available it finds out what the task is and performs it. When it has finished the task it either grabs another task and performs that or waits again if there are no tasks. Terragen has a number of worker threads which ask the renderer for the next task (bucket/tile) whenever they have finished their current task. It is possible to keep this fairly simple because a render usually takes a long time, so it's OK for the threads to terminate when there are no tasks left to perform. Generally though, the worker thread just waits while there are no tasks in the queue.

To wait until there are tasks to perform you would wait on what is known as a condition variable. The worker thread is able to wait (sleep) until another thread (e.g. the main thread) calls notify for the condition variable which the worker thread was waiting on. Condition variables are explained here. There is also example code with Boost Threads if you decide to use that. For me the biggest hurdle to using Boost is the complicated build setup, but it's mostly a one-off and is worth it in the long run. Boost is useful for many other things too.

Matt
Just because milk is white doesn't mean that clouds are made of milk.

Sengin

#2
Hmmm....interesting.  Thank you Matt!

Oh, do you happen to know if there's a way in C++ to call a function without pushing the registers first?  So instead of, say,
push ecx
push edx
call function1
pop edx
pop ecx


it would just be
jmp <address of function 1>

The reason I ask is because I'm wondering if it's possible to implement what I'm thinking by creating a thread, processing a function (no parameters, would essentially be to process the odd-numbered elements in an array), waiting on a condition variable, when given a signal to proceed, it grabs the next function to process by grabbing a function pointer, and then at the end of that function it waits on a condition variable, etc...  With this approach, a stack is not needed (or wanted) because there are no registers to push because all I'll need the thread to do is to execute some data, not call functions (so I won't need a stack pointer, frame pointer, and any parameters I may need could be global and each function will know which globals to grab).  Plus the function will never execute a return instruction, so the stack will never decrease and will eventually run out of room.  Or is that not available in C++ and I'd have to modify the assembly?

Thanks.

Matt

I have no idea, sorry. I'm not sure why this is a priority though. Multi-threaded development is tricky enough without getting bogged down in low level details like this before you have even started. I would get the high level algorithms running correctly first.

Matt
Just because milk is white doesn't mean that clouds are made of milk.

Sengin

It's not really a priority, I've just been really interested in multi-threading lately.  I kept coming up with ideas that may or may not work in improving efficiency (as much as possible, as in down to the pushing and popping of registerse) and couldn't find anything related online (it's like an itch you can't scratch but with curiosity).  But I guess a small thread pool with one or two threads would probably suit my needs.

Thanks!

latego

Multithreading is not difficult provided that you get ready for it.

Troubles arise when you have different threads accessing the same data so... the first line of defence is not to access common data. Istanciate your data as much as possibile on the stack and on the heap and access it from only one thread. Obviously, you cannot do this all the time so... next step is to make global data as mush as possible read-only. The last step is to protect data which has to be accessed from different threads with critical sections.

Threads should communicate as little as possible. Usually, multithread apps have worker threads which do the hard work and one or more controlling threads which manage the whole thing. The easiest way is to have worker threads getting work job from a dedicated FIFO queue and put the results in another FIFO queue. The controlling thread has only to check for the shortest input queue, add a work unit to it and then scan the output queues for one with at least one result.

Getting to raytracing, usually the process requires a phase in which you build your world, create fast access structures (octatrees and the like) and then render pixels one by one. The natural subdivision is then to use one thread to perform the global work phase and then unleash raytracing threads, for example parallelizing the process to 1 render line per thread. If you did your homework, the renderer is a class which receives the world and the indication of which line to render, so the worker thread is extremely simple: at the beginning, instanciate the renderer and get the world "pointer", then wait on the input queue for a line index, wake up, render the line and put the RGB results in the output queue. As you can see, a clean object oriented model maps into a multithreadable architecture very cleanly. As a real example, POV Ray scale almost perfectly linearly on 8 cores (POV is a darling for multi core benchmarkers).

Bye!!!

P.S.: I started my multithreader carrier in 1990, using a multiprocessor system which had, in its largest configuration, 36 worker processors (80826 at 10 MHz) and the task was to create a system (for the United States Postal Service) able to read the address on letters while they were handled, 12 letters per second. It filled two racks and it worked. It had only one trouble: it costed too much w.r.t. competion, which used a few MIPS processors.

nikita

Quote from: Sengin on June 12, 2008, 11:55:28 AM
It's not really a priority, I've just been really interested in multi-threading lately.  I kept coming up with ideas that may or may not work in improving efficiency (as much as possible, as in down to the pushing and popping of registerse) and couldn't find anything related online
Buy a book.
Seriously.. you're not the first person out there who wonders about that kind of stuff. There should be plenty of literature about both how to avoid the usual pitfalls of multiprocessing like deadlocks, mutual waiting etc. and asm code optimizations.  :)

If you want to keep searching the internet, try google scholar http://scholar.google.com/ when looking for research papers.