Maker Pro

Video Motion Detection Algorithms--How Does This Work?

W. eWatson
Jan 1, 1970
I'm trying to determine how the algorithm described below works. Perhaps
it's a common, simple method. It's for a b/w video camera, and a mask is
used to block areas that are not of interest.

-----------------
When performing frame-to-frame differencing, the video motion detection
box counts the number of pixels that increase in brightness by more than
the corresponding threshold value in the image mask. When this number of
above-threshold pixels exceeds the number in the Trigger Threshold field
for two frames in a row, an event occurs. Once an event occurs, the
event is not complete until the number of above-threshold pixels falls
below the number in the Untrigger Threshold field for two frames in a row.
------------------
The description more or less sounds good, but the interplay between the
two image frames and the mask is not clear to me. Is a difference taken
between the two images and then compared against the mask values, or is
the mask applied to each image and the two results compared against one
another?

In this case, a mask is produced by assigning a value from 0 to 255 to
each pixel. For a pixel to be counted as motion, the image value for
that pixel must be greater than the mask value. A simple flat mask
assigns a value of, say, 50 to all pixels and considers only pixels in
the image that exceed 50 in brightness.

P.S.
I happened to find an old text on this, Digital Image Processing, by
Gonzalez and Wintz, and noticed they discuss such matters in a chapter
titled Image Segmentation.
 
Jon Slaughter
Jan 1, 1970
W. eWatson said:
I'm trying to determine how the algorithm described below works.
Perhaps it's a common simple method. It's for a b/w video camera and
a mask is used to block areas that are not of interest.

-----------------
When performing frame-to-frame differencing, the video motion
detection box counts the number of pixels that increase in brightness
by more than the corresponding threshold value in the image mask.
When this number of above-threshold pixels exceeds the number in the
Trigger Threshold field for two frames in a row, an event occurs.
Once an event occurs, the event is not complete until the number of
above-threshold pixels falls
below the number in the Untrigger Threshold field for two frames in a
row. ------------------
The description more or less sounds good, but interplay between the
two image frames and the mask is not clear to me. Is a difference made
between the two images, and then against the mask values or is the
mask applied against each image and the two results compared against
one another?

In this case, a mask is produced by assigning a value from 0 to 255 to
each pixel. For a pixel to be considered as a motion the imaged value
for the pixel must be greater than the mask value. A simple flat mask
is to assign a value of, say, 50 to all pixels and consider only
pixels in the image that exceed 50 in brightness.

P.S.
I happened to find an old text on this, Digital Image Processing, by
Gonzalez and Wintz, and noticed they discuss such matters in a chapter
titled Image Segmentation.

It has to do with optical flow. For motion, color is mostly irrelevant;
only the change in luminance is important.

By knowing the optical flow you have some idea of how something is moving,
because if an object is static then the light reflecting off it will not
change. It doesn't work in all situations, since it only detects relative
motion (although I imagine more advanced algorithms could determine
whether it is the light or the object that is moving).

Anyways, look up optical flow...
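Not optical flow itself, but the luminance-only front end mentioned above can be sketched like this. The Rec. 601 luma weights are a common choice; this is just an illustration, not any particular product's code:

```python
import numpy as np

def luminance(rgb):
    # Rec. 601 luma weights; the color itself is discarded, since
    # only changes in brightness matter for motion purposes.
    return rgb @ np.array([0.299, 0.587, 0.114])

def luma_change(frame_a, frame_b):
    # Mean absolute change in luminance between two RGB frames.
    return float(np.mean(np.abs(luminance(frame_b) - luminance(frame_a))))
```

A large `luma_change` between consecutive frames is the raw signal that fancier flow algorithms then try to attribute to actual motion.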
 
MeowSayTongue
Jan 1, 1970
When performing frame-to-frame differencing, the video motion detection
box counts the number of pixels that increase in brightness by more than
the corresponding threshold value in the image mask.


The pixel rows get scanned. When the difference is detected, it is
noted. Then, changes from frame to frame are noted.

The compression scheme is to compress the known, non-changing frame
areas into short strings that describe those screen areas. These strings
take up far less per-frame space in the data stream than the entire frame
data would.

So motion detection would really be properly termed as change
detection, and it is used to compress video. General Instrument (now
Motorola) can compress 12 video channels into one standard 6MHz wide
video channel. That is a LOT of compression.

Since changes in scenes in motion video usually result from motion with
respect to the frame itself, it is termed motion detection. But it really
is "change detection": the "engine" that does the work doesn't know
anything about "motion"; it only knows what changed, which typically
equates to motion, so WE call it motion detection.
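As an illustration of change detection used for compression — a toy sketch, not General Instrument's actual scheme — one can flag only the blocks of a frame that changed and retransmit nothing for the rest:

```python
import numpy as np

def changed_blocks(prev, curr, block=8, tol=0):
    # Return coordinates of blocks whose pixels differ by more than tol.
    # Unchanged blocks need no new data in the stream, which is where
    # the compression comes from.
    h, w = curr.shape
    out = []
    for y in range(0, h, block):
        for x in range(0, w, block):
            a = prev[y:y + block, x:x + block]
            b = curr[y:y + block, x:x + block]
            if np.max(np.abs(b.astype(int) - a.astype(int))) > tol:
                out.append((y, x))
    return out
```

If only one 8x8 block of a frame changed, only that block's data needs to be sent; the rest is described by "same as before."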
 
MeowSayTongue
Jan 1, 1970
(although I imagine more advanced algorithms could determine if the
light or the object is what is moving).


They do no such thing. They merely note changes in pixel values on a
row by row basis.

It is a misnomer to call it motion detection, but since most changes
in a video scene relate to motion of either the camera through the scene
or the elements in the scene, it is termed motion detection.

What it really is requires no special brains, no determination of what
is moving, nor any of the other stuff I see folks saying here.

It is simply CHANGE detection. That ALWAYS relates to motion within
the frame, regardless of whether it is an element in the frame moving or
the camera panning through a scene. The 'engine' does not know or care
what the reasons are for the changes. All it is doing is tracking said
changes to facilitate HUGE compression capacities, which is the entire
reason that "motion detection" is even used.
 
MeowSayTongue
Jan 1, 1970
I do this a bit differently.
I subtract the previous frame from the current frame.


That only works when each frame is sent in the data.

With compression, which is what motion detection is for, the previous
frame may be a compilation of other frames and/or data that was
considered non-changing. There are also repeated frames.

You would not be able to do that from an MPEG2 stream.
 
MeowSayTongue
Jan 1, 1970
and if that value exceeds a preset value,
the threshold, you have some motion,


Hahaha. Motion detection is not merely detecting if motion occurred.
Motion detection is specifically so that subsequent frames can be sent
using about one tenth of the data.
 
W. eWatson
Jan 1, 1970
ChrisQ said:
From what I've read, some systems split the frame into blocks to
determine the size of the object. This allows differentiation between
(for example) small animals or birds and humans...

Regards,

Chris
Well, let's try this. Suppose we have six frames (each a matrix of pixel
values, all the same size): f1, f2, f3, f4, f5 and f6. Call m0 the
mask, which is the same size as any frame. Further, let's suppose all
mask values are 70 (in a range from 0 to 255). In this application, 70
means that a value in a comparison frame lower than or equal to 70 is to
be ignored. f6 is the current frame. I'm assuming that if a pixel in f6
is 90 and the same pixel in f1 is 100, then the result is -10, which
turns into 0. All resulting pixels in the subtraction are either above
70 or are 0. Now let's take a 2x2 example.

m0 = 70 70    (note a mask could also be a mix of values, e.g. 30 80
     70 70                                                    40 55)

f6 = 10 25
     90 75

f1 = 75 95
     80 45

How do we proceed to get something out of this to determine if there's
motion? I would like to think that f6-f1 is a start, but how does one
introduce m0? That is, how does one produce a final matrix in which the
count of values > 0 is compared against a threshold value of, say, 2?

Is f6-f1 really needed? One could look at the three and say the result,
after zeroing out values below or equal to 70, is really:
f6  0  0
   90 75

f1 75 95
   80  0

If one subtracts the two (f6 - f1):
-75 -95
 10  75

So do we count positive values here?
 
W. eWatson
Jan 1, 1970
Ah, an authority on the code has clued me in on what is really happening.
I'll be back later to straighten this out. It's a question of whether a
pixel has brightened by the amount given in the mask. Negative and
positive values also play a strong role in this.
 
MeowSayTongue
Jan 1, 1970
How do we proceed to get something out of this to determine if there's
motion? I would like to think that f6-f1 is a start,


Well, you can start getting into edge detection algorithms to distinguish
between motion and mere brightness change.

That differentiates between a moving or varying light source and a
moving object.

You also have to determine if the camera is moving instead of an object
in the camera's view.

This is easy if the camera is always stationary.
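A crude sketch of what such an edge-detection pass might look like — a plain 3x3 Sobel filter, my own illustration rather than anything product-specific:

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(img):
    # Gradient magnitude over the valid interior region. A uniform
    # brightness change shifts all pixels equally and leaves edges
    # unchanged, which is what separates it from real motion.
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            patch = img[y:y + 3, x:x + 3]
            out[y, x] = np.hypot(np.sum(patch * SOBEL_X),
                                 np.sum(patch * SOBEL_Y))
    return out
```

Tracking where the edge map moves between frames, rather than where raw brightness changes, is one way to reject a flickering light source.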

Meow
 
GM
Jan 1, 1970
MeowSayTongue said:
Well, you can start getting into edge detection algorithms to discern
between motion and mere brightness change.

Also, an average of the X and Y positions of all the pixels that have
big changes from frame to frame would probably be helpful, at least for
detecting movement in the X-Y plane of the camera. This would not work
perfectly if the object were moving towards or away from the camera (the
Z axis), but in that case the spread of the Xs and Ys of the changing
pixels would get bigger or smaller, respectively, so the detection
algorithm should also take notice of that.
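A rough sketch of that idea — my own illustration, with a made-up change threshold:

```python
import numpy as np

def changed_pixel_stats(prev, curr, thresh=30):
    # Centroid (mean x, mean y) and spread of the pixels that changed
    # a lot between frames. The centroid tracks X-Y motion; a growing
    # or shrinking spread hints at motion along the Z axis.
    ys, xs = np.nonzero(np.abs(curr.astype(int) - prev.astype(int)) > thresh)
    if len(xs) == 0:
        return None
    centroid = (float(xs.mean()), float(ys.mean()))
    spread = float(np.hypot(xs.std(), ys.std()))
    return centroid, spread
```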

PS:
Sorry for my poor English :)
 
W. eWatson
Jan 1, 1970
MeowSayTongue said:
Well, you can start getting into edge detection algorithms to discern
between motion and mere brightness change.

That differentiates between a moving or varying light source and a
moving object.

You also have to determine if the camera is moving instead of an object
in the camera's view.

This is easy if the camera is always stationary.

Meow
The camera is definitely stationary. It has a fisheye lens that is
looking at the entire sky all night long. It's looking for meteors.

My confusion about how it works was that it seemed to me only positive
integers were allowed, even when a result was negative. Negative values
are in fact allowed, and in the final result it's the positive numbers
that are counted. Add to that my misinterpretation of the mask: the
values are deltas, not thresholds themselves.

Briefly, the method is: subtract f1 from f6, giving f6-f1. Now subtract
m0, which only has positive numbers, from that result. Call it D. D has
both positive and negative numbers, but only the positive ones show how
much the brightness of a pixel has increased above the mask value. Count
the positive values to see if the total is above some specified value.

I have no idea what this method is called, but it seems to work pretty
well unless the camera produces a lot of noise from cranking up the gain.
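In code, with the 2x2 numbers from my earlier post — a sketch of my understanding, not the vendor's actual code:

```python
import numpy as np

f1 = np.array([[75, 95], [80, 45]])   # earlier frame
f6 = np.array([[10, 25], [90, 75]])   # current frame
m0 = np.array([[70, 70], [70, 70]])   # mask: per-pixel brightening deltas

# D is positive exactly where a pixel brightened by more than the mask value.
D = (f6 - f1) - m0
count = int(np.sum(D > 0))            # every entry of D is negative here, so count is 0
```

With these numbers no pixel brightened by more than 70, so the count is zero and no event triggers.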
 
W. eWatson
Jan 1, 1970
MeowSayTongue said:
The pixel rows get scanned. When the difference is detected, it is
noted. Then, changes from frame to frame are noted.

The compression schema is to compress the known, non-changing frame
areas into short strings that describe those screen areas. These strings
take up far less per frame space in the data stream than the entire frame
data would.

So motion detection would really be properly termed as change
detection, and it is used to compress video. General Instrument (now
Motorola) can compress 12 video channels into one standard 6MHz wide
video channel. That is a LOT of compression.

Since changes in scenes in motion video usually result from motion with
respect to the frame itself, it is termed motion detection. But it really
is "change detection": the "engine" that does the work doesn't know
anything about "motion"; it only knows what changed, which typically
equates to motion, so WE call it motion detection.
See my post from a few minutes ago.
 
MeowSayTongue
Jan 1, 1970
The camera is definitely stationary. It has a fisheye lens that is
looking at the entire sky all night long. It's looking for meteors.


Then it would be much better to use a high-resolution optical recording
setup and a PMT-type detector setup to trigger the recording events.

Meow
 
W. eWatson
Jan 1, 1970
MeowSayTongue said:
Then it would be much better to use high resolution optical recording
setup and a PMT type detector set-up to trigger the record events.

Meow
Actually, the small video camera gives very good results. Some of the
cameras networked within 50-100 miles of one another produce good enough
data to determine the atmospheric trajectory.
 
W. eWatson
Jan 1, 1970
MeowSayTongue said:
Hahaha. Motion detection is not merely detecting if motion occurred.
Motion detection is specifically so that subsequent frames can be sent
using about one tenth of the data.
Yes, a device between the camera and a PC uses this method to send data
to the PC only when there is possible movement. It dramatically cuts
down the amount of data the PC receives. In fact, if there is an event,
it only sends data in a square box, 120 pixels on a side, that follows
the meteor.
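Clamping such a box to the frame edges might look like this — my own illustration of the idea, not the device's code (the 120-pixel size is the only detail taken from the actual setup):

```python
import numpy as np

def event_window(frame, cx, cy, size=120):
    # Crop a size x size box centred on (cx, cy), shifted inward as
    # needed so the window never runs off the frame edges.
    h, w = frame.shape
    x0 = min(max(cx - size // 2, 0), max(w - size, 0))
    y0 = min(max(cy - size // 2, 0), max(h - size, 0))
    return frame[y0:y0 + size, x0:x0 + size]
```

Re-centring the window on the changed-pixel centroid each frame lets the box follow the meteor across the sky.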
 
JosephKK
Jan 1, 1970
They do no such thing. They merely note changes in pixel values on a
row by row basis.

It is a misnomer to call it motion detection, but since most changes
in a video scene relate to motion of either the camera through the scene
or the elements in the scene, it is termed motion detection.

What it really is requires no special brains, no determination of what
is moving, nor any of the other stuff I see folks saying here.

It is simply CHANGE detection. That ALWAYS relates to motion within
the frame, regardless of whether it is an element in the frame moving or
the camera panning through a scene. The 'engine' does not know or care
what the reasons are for the changes. All it is doing is tracking said
changes to facilitate HUGE compression capacities, which is the entire
reason that "motion detection" is even used.

Just because you are all hung up on MPEG and similar compression
systems does not mean that all other systems are the same.
 