1
00:00:01,640 --> 00:00:08,170
PROFESSOR: So the handouts will
be just the problem set 8

2
00:00:08,170 --> 00:00:11,120
solutions, of which you already
have the first two.

3
00:00:11,120 --> 00:00:15,840
Remind you problem set 9 is
due on Friday, but we will

4
00:00:15,840 --> 00:00:17,840
accept it on Monday if that's
when you want to

5
00:00:17,840 --> 00:00:20,230
hand it in to Ashish.

6
00:00:20,230 --> 00:00:24,620
And problem set 10 I will hand
out next week, but you won't

7
00:00:24,620 --> 00:00:27,020
be responsible for it.

8
00:00:27,020 --> 00:00:32,490
You could try it if
you're so moved.

9
00:00:32,490 --> 00:00:33,530
OK.

10
00:00:33,530 --> 00:00:35,370
We're in the middle
of chapter 13.

11
00:00:35,370 --> 00:00:39,540
We've been talking about
capacity approaching codes.

12
00:00:39,540 --> 00:00:43,640
We've talked about a number of
classes of them, low density

13
00:00:43,640 --> 00:00:48,220
parity check, turbo, repeat
accumulate, and I've given you

14
00:00:48,220 --> 00:00:52,610
a general idea of how the sum
product decoding algorithm is

15
00:00:52,610 --> 00:00:54,580
applied to decode these codes.

16
00:00:54,580 --> 00:00:58,093
These are all defined on graphs
with cycles, in the

17
00:00:58,093 --> 00:01:02,330
middle of which is a large
pseudo random interleaver.

18
00:01:02,330 --> 00:01:06,740
The sum product algorithm is
therefore done iteratively.

19
00:01:06,740 --> 00:01:10,400
In general, the initial observed
information comes in

20
00:01:10,400 --> 00:01:13,880
on one side, the left side or
the right side, and the

21
00:01:13,880 --> 00:01:17,880
iterative schedule amounts to
doing first the left side,

22
00:01:17,880 --> 00:01:19,980
then the right side, then the
left side, then the right

23
00:01:19,980 --> 00:01:24,200
side, until you converge,
you hope.

24
00:01:24,200 --> 00:01:27,650
That was the original turbo
idea, and that continues to be

25
00:01:27,650 --> 00:01:30,290
the right way to do it.

26
00:01:30,290 --> 00:01:31,290
OK.

27
00:01:31,290 --> 00:01:33,800
Today, we're actually going to
try to do some analysis.

28
00:01:33,800 --> 00:01:38,300
To do the analysis, we're going
to focus on low density

29
00:01:38,300 --> 00:01:44,330
parity check codes, which are
certainly far easier than

30
00:01:44,330 --> 00:01:46,460
turbo codes to analyze,
because they

31
00:01:46,460 --> 00:01:47,760
have such simple elements.

32
00:01:47,760 --> 00:01:51,580
I guess the repeat accumulate
codes are equally easy to

33
00:01:51,580 --> 00:01:55,510
analyze, but maybe not as
good in performance.

34
00:01:55,510 --> 00:01:58,200
Maybe they're as good,
I don't know.

35
00:01:58,200 --> 00:02:00,390
No one has driven that
as far as low density

36
00:02:00,390 --> 00:02:01,640
parity check codes.

37
00:02:03,880 --> 00:02:07,340
Also, we're going to take
a very simple channel.

38
00:02:07,340 --> 00:02:11,110
It's actually the channel for
which most of the analysis has

39
00:02:11,110 --> 00:02:16,170
been done, which is the Binary
Erasure Channel, where

40
00:02:16,170 --> 00:02:19,670
everything reduces to a
one-dimensional problem, and

41
00:02:19,670 --> 00:02:22,980
therefore we can do things
quite precisely.

42
00:02:22,980 --> 00:02:26,850
But this will allow me to
introduce density evolution,

43
00:02:26,850 --> 00:02:30,320
which is the generalization of
this for more general channels

44
00:02:30,320 --> 00:02:33,910
like the binary input additive
white Gaussian noise channel,

45
00:02:33,910 --> 00:02:36,400
if I manage to go fast enough.

46
00:02:36,400 --> 00:02:39,340
I apologize today, I
do feel in a hurry.

47
00:02:39,340 --> 00:02:43,670
Nonetheless, please ask
questions whenever you want to

48
00:02:43,670 --> 00:02:47,220
slow me down or just get some
more understanding.

49
00:02:47,220 --> 00:02:50,310
So, the Binary Erasure Channel
is one of the elementary

50
00:02:50,310 --> 00:02:53,260
channels, if you've ever taken
information theory,

51
00:02:53,260 --> 00:02:54,620
that you look at.

52
00:02:54,620 --> 00:02:59,010
It has two inputs and
three outputs.

53
00:02:59,010 --> 00:03:02,910
The two inputs are 0 and 1, the
outputs are 0 and 1 or an

54
00:03:02,910 --> 00:03:06,350
erasure, an ambiguous output.

55
00:03:06,350 --> 00:03:11,110
If you send a 0, you can either
get the 0 correctly, or

56
00:03:11,110 --> 00:03:13,030
you could get an erasure.

57
00:03:13,030 --> 00:03:15,540
Might be a deletion, you just
don't get anything.

58
00:03:15,540 --> 00:03:18,080
Similarly, if you send
a 1, you either

59
00:03:18,080 --> 00:03:19,290
get a 1 or an erasure.

60
00:03:19,290 --> 00:03:22,480
There's no possibility of
getting something incorrectly.

61
00:03:22,480 --> 00:03:26,870
That's the key thing
about this channel.

62
00:03:26,870 --> 00:03:30,050
The probability of an erasure
is p, regardless of whether

63
00:03:30,050 --> 00:03:31,980
you send 0 or 1.

64
00:03:31,980 --> 00:03:36,330
So there's a single parameter
that governs this channel.

65
00:03:36,330 --> 00:03:40,486
Now admittedly, this is not
a very realistic channel.

66
00:03:40,486 --> 00:03:43,580
It's a toy channel in
the binary case.

67
00:03:43,580 --> 00:03:50,330
However, actually some of the
impetus for this development

68
00:03:50,330 --> 00:03:53,080
was the people who were
considering packet

69
00:03:53,080 --> 00:03:55,830
transmission on the internet.

70
00:03:55,830 --> 00:03:58,110
And in the case of packet
transmission on the internet,

71
00:03:58,110 --> 00:04:02,660
of course, you have a long
packet, a very non-binary

72
00:04:02,660 --> 00:04:06,750
symbol if you like, but if you
consider these to be packets,

73
00:04:06,750 --> 00:04:08,880
then on the internet, you either
receive the packet

74
00:04:08,880 --> 00:04:11,950
correctly or you fail
to receive it.

75
00:04:11,950 --> 00:04:16,250
You don't receive it at all, and
you know it, because there

76
00:04:16,250 --> 00:04:19,240
is an internal parity check
in each packet.

77
00:04:19,240 --> 00:04:23,220
So the q-ary erasure channel is
in fact a realistic model,

78
00:04:23,220 --> 00:04:26,310
and in fact, there's a company,
Digital Fountain,

79
00:04:26,310 --> 00:04:31,310
that has been founded and is
still going strong as far as I

80
00:04:31,310 --> 00:04:36,510
know, which is specifically
devoted to solutions for this

81
00:04:36,510 --> 00:04:40,250
q-ary erasure channel for
particular kinds of scenarios

82
00:04:40,250 --> 00:04:43,140
on the internet where you want
to do forward error correction

83
00:04:43,140 --> 00:04:46,280
rather than repeat
transmission.

84
00:04:46,280 --> 00:04:51,370
And a lot of the early work on
the analysis of these guys--

85
00:04:51,370 --> 00:04:55,350
Luby, Shokrollahi,
other people--

86
00:04:55,350 --> 00:04:58,840
they were some of the people
who focused on low density

87
00:04:58,840 --> 00:05:03,730
parity check codes immediately,
following work of Spielman and

88
00:05:03,730 --> 00:05:10,110
Sipser here at MIT, and said,
OK, suppose we try this on our

89
00:05:10,110 --> 00:05:11,950
q-ary erasure channel.

90
00:05:11,950 --> 00:05:15,440
And they were able to get very
close to the capacity of the

91
00:05:15,440 --> 00:05:18,190
q-ary erasure channel, which
is also 1 minus p.

92
00:05:18,190 --> 00:05:21,230
This is the information
theoretic capacity of the

93
00:05:21,230 --> 00:05:22,110
binary channel.

94
00:05:22,110 --> 00:05:25,940
It's kind of obvious that it
should be 1 minus p, because

95
00:05:25,940 --> 00:05:29,410
on the average, you get 1 minus
p good bits out for

96
00:05:29,410 --> 00:05:30,800
every bit that you send in.

97
00:05:30,800 --> 00:05:34,090
So the maximum rate you could
expect to send over this

98
00:05:34,090 --> 00:05:38,100
channel is 1 minus p.

99
00:05:38,100 --> 00:05:39,350
OK.
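The channel model and the 1 minus p capacity argument are easy to check numerically. A minimal sketch, not from the lecture (the function name `bec` is mine):

```python
import random

def bec(bits, p, rng):
    """Binary erasure channel: each bit is independently erased
    (replaced by None) with probability p; bits are never flipped."""
    return [None if rng.random() < p else b for b in bits]

rng = random.Random(1)
out = bec([0, 1] * 50_000, 0.4, rng)
unerased = sum(b is not None for b in out) / len(out)
# By the law of large numbers, about 1 - p = 0.6 of the bits come
# through, which is why the capacity of the BEC is 1 - p bits per use.
```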

100
00:05:41,060 --> 00:05:44,740
Let's first think about maximum
likelihood decoding on

101
00:05:44,740 --> 00:05:45,300
this channel.

102
00:05:45,300 --> 00:05:49,980
Suppose we take a word from a
code, from a binary code, and

103
00:05:49,980 --> 00:05:53,750
send it through this channel,
and we get some erasure

104
00:05:53,750 --> 00:05:55,880
pattern at the output.

105
00:05:55,880 --> 00:05:59,750
So we have a subset of the
bits that are good, and a

106
00:05:59,750 --> 00:06:03,310
subset that are erased
at the output.

107
00:06:03,310 --> 00:06:06,150
Now what does maximum likelihood
decoding amount to

108
00:06:06,150 --> 00:06:07,400
on this channel?

109
00:06:10,630 --> 00:06:15,780
Well, the code word that we sent
is going to match up with

110
00:06:15,780 --> 00:06:19,400
all the good bits
received, right?

111
00:06:19,400 --> 00:06:21,580
So we know that there's going to
be at least one word in the

112
00:06:21,580 --> 00:06:23,940
code that agrees with
the received

113
00:06:23,940 --> 00:06:25,690
sequence in the good places.

114
00:06:28,430 --> 00:06:30,970
If that's the only word in the
code that agrees with the

115
00:06:30,970 --> 00:06:33,310
received word in those places,
then we can declare it the

116
00:06:33,310 --> 00:06:34,900
winner, right?

117
00:06:34,900 --> 00:06:37,070
And maximum likelihood
decoding succeeds.

118
00:06:37,070 --> 00:06:40,130
We know what the channel is, so
we know that all the good

119
00:06:40,130 --> 00:06:43,070
bits have to match up
with the code word.

120
00:06:43,070 --> 00:06:46,120
But suppose there are 2 words
in the code that match up in

121
00:06:46,120 --> 00:06:48,960
all the good places?

122
00:06:48,960 --> 00:06:50,790
There's no way to decide
between them, right?

123
00:06:54,180 --> 00:06:56,900
So basically, that's what
maximum likelihood decoding

124
00:06:56,900 --> 00:06:58,630
amounts to.

125
00:06:58,630 --> 00:07:07,410
You simply check how many
code words match the

126
00:07:07,410 --> 00:07:09,120
received good bits.

127
00:07:09,120 --> 00:07:12,160
If there's only one,
you decode.

128
00:07:12,160 --> 00:07:15,480
If there's more than one, you
could flip a coin, but we'll

129
00:07:15,480 --> 00:07:18,290
consider that to be a
decoding failure.

130
00:07:18,290 --> 00:07:20,820
You just don't know, so you
throw up your hands, you have

131
00:07:20,820 --> 00:07:24,110
a detected decoding failure.

132
00:07:24,110 --> 00:07:28,860
So in the case of a linear code,
what are we doing here?

133
00:07:28,860 --> 00:07:31,950
In the case of a linear
code, consider the

134
00:07:31,950 --> 00:07:33,130
parity check equations.

135
00:07:33,130 --> 00:07:38,030
We basically have n minus k
parity check equations, and

136
00:07:38,030 --> 00:07:44,040
we're trying to find how many
code sequences solve

137
00:07:44,040 --> 00:07:45,780
those parity check equations.

138
00:07:45,780 --> 00:07:51,350
So we have n minus k equations,
n unknowns, and

139
00:07:51,350 --> 00:07:55,310
we're basically just trying
to solve linear equations.

140
00:07:55,310 --> 00:07:57,240
So that would be one
decoding method for

141
00:07:57,240 --> 00:07:59,250
maximum likelihood decoding.

142
00:07:59,250 --> 00:08:00,370
Solve the equations.

143
00:08:00,370 --> 00:08:02,450
If you get a unique solution,
you're finished.

144
00:08:02,450 --> 00:08:05,970
If you get a space of solutions,
so dimension one or

145
00:08:05,970 --> 00:08:09,270
more, you lose.

146
00:08:09,270 --> 00:08:09,730
OK?

147
00:08:09,730 --> 00:08:14,330
So we know lots of ways of
solving linear equations like

148
00:08:14,330 --> 00:08:17,068
Gaussian elimination with

149
00:08:17,068 --> 00:08:18,318
back substitution.
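This linear-algebra view of ML erasure decoding can be sketched directly: restrict the parity checks to the erased positions and solve over GF(2). A unique solution decodes; a solution space of dimension one or more is a detected failure. This is a sketch, not from the lecture; the function name and the small (7,4) Hamming example are mine:

```python
def solve_erasures(H, received):
    """ML decoding on the BEC for a binary linear code with
    parity-check matrix H (a list of 0/1 rows): solve the checks for
    the erased positions (None entries) over GF(2).  Returns the
    completed codeword, or None on a detected decoding failure."""
    n = len(received)
    erased = [j for j in range(n) if received[j] is None]
    # Each check gives: sum of erased bits = sum of known bits (mod 2).
    rows = []
    for h in H:
        a = [h[j] for j in erased]
        b = sum(h[j] * received[j] for j in range(n)
                if received[j] is not None) % 2
        rows.append([a, b])
    # Gaussian elimination over GF(2) on the erased columns.
    rank, pivots = 0, []
    for c in range(len(erased)):
        piv = next((i for i in range(rank, len(rows)) if rows[i][0][c]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][0][c]:
                rows[i][0] = [x ^ y for x, y in zip(rows[i][0], rows[rank][0])]
                rows[i][1] ^= rows[rank][1]
        pivots.append((c, rank))
        rank += 1
    if rank < len(erased):
        return None  # solution space has dimension >= 1: failure
    out = list(received)
    for c, r in pivots:
        out[erased[c]] = rows[r][1]
    return out
```

When the columns of H at the erased positions are linearly dependent, the elimination leaves a free variable and the decoder reports a failure — exactly the "space of solutions" case described in the lecture.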

150
00:08:21,290 --> 00:08:24,140
That's actually what we will
be doing with low density

151
00:08:24,140 --> 00:08:26,630
parity check codes.

152
00:08:26,630 --> 00:08:30,780
And so, decoding for the binary
erasure channel you can

153
00:08:30,780 --> 00:08:33,740
think of as just trying to
solve linear equations.

154
00:08:33,740 --> 00:08:39,840
If you get a unique solution,
you win, otherwise, you fail.

155
00:08:39,840 --> 00:08:45,460
Another way of looking at it
in a linear code is what do

156
00:08:45,460 --> 00:08:48,920
the good bits have to form?

157
00:08:48,920 --> 00:08:52,920
The erased bits have to
be a function of the

158
00:08:52,920 --> 00:08:55,570
good bits, all right?

159
00:08:55,570 --> 00:08:59,450
In a linear code, that's just
a function of where

160
00:08:59,450 --> 00:09:00,630
the good bits are.

161
00:09:00,630 --> 00:09:02,940
We've run into this
concept before.

162
00:09:02,940 --> 00:09:05,366
We called it an information
set.

163
00:09:05,366 --> 00:09:09,330
An information set is a subset
of the coordinates that

164
00:09:09,330 --> 00:09:11,760
basically determines the
rest of the coordinates.

165
00:09:11,760 --> 00:09:14,150
If you know the bits in
that subset, then you

166
00:09:14,150 --> 00:09:14,840
know the code word.

167
00:09:14,840 --> 00:09:17,870
You can fill out the rest of
the code word through some

168
00:09:17,870 --> 00:09:19,050
linear equation.

169
00:09:19,050 --> 00:09:24,500
So basically, we're going to
succeed if the good bits cover

170
00:09:24,500 --> 00:09:29,840
an information set, and we're
going to fail otherwise.

171
00:09:29,840 --> 00:09:35,200
So how many bits do we need to
cover an information set?

172
00:09:35,200 --> 00:09:37,690
We're certainly going
to need at least k.

173
00:09:37,690 --> 00:09:40,440
Now today, we're going to be
considering very long codes.

174
00:09:40,440 --> 00:09:45,550
So suppose I have a
long (n,k) code.

175
00:09:45,550 --> 00:09:50,050
I have an (n,k) code, and I
transmit it over this channel.

176
00:09:50,050 --> 00:09:52,915
About how many bits are
going to be erased?

177
00:09:55,700 --> 00:09:59,400
About p times n bits are going
to be erased, leaving (1

178
00:09:59,400 --> 00:10:01,380
minus p) times n unerased.

179
00:10:01,380 --> 00:10:03,510
We're going to get
approximately--

180
00:10:03,510 --> 00:10:05,120
law of large numbers--

181
00:10:05,120 --> 00:10:14,340
(1 minus p) times n unerased
bits, and this has to be

182
00:10:14,340 --> 00:10:18,400
clearly greater than k.

183
00:10:18,400 --> 00:10:22,620
OK, so with very high
probability, if we get more

184
00:10:22,620 --> 00:10:27,190
than that, we'll be able to
solve the equations, find a

185
00:10:27,190 --> 00:10:28,020
unique code word.

186
00:10:28,020 --> 00:10:31,390
If we get fewer than that,
there's no possible way we

187
00:10:31,390 --> 00:10:32,550
could solve the equations.

188
00:10:32,550 --> 00:10:34,980
We don't have enough left.

189
00:10:34,980 --> 00:10:37,310
So what does this say?

190
00:10:37,310 --> 00:10:44,600
This says that k over n, which
is the code rate, has to be

191
00:10:44,600 --> 00:10:49,260
less than 1 minus p in order
for this maximum likelihood

192
00:10:49,260 --> 00:10:52,480
decoding to work with a linear
code over the binary erasure

193
00:10:52,480 --> 00:10:57,370
channel, and that is consistent
with this.

194
00:10:57,370 --> 00:11:00,410
If the rate is less than 1 minus
p, then with very high

195
00:11:00,410 --> 00:11:02,480
probability you're going
to be successful.

196
00:11:02,480 --> 00:11:08,360
If it's greater than 1 minus p,
no chance, as n becomes large.

197
00:11:08,360 --> 00:11:08,890
OK?

198
00:11:08,890 --> 00:11:09,820
You with me?

199
00:11:09,820 --> 00:11:11,070
AUDIENCE: [UNINTELLIGIBLE]?

200
00:11:13,080 --> 00:11:17,090
PROFESSOR: Well, in general,
they're not, and the first

201
00:11:17,090 --> 00:11:21,630
exercise on the homework
says take the (8,4,4) code.

202
00:11:21,630 --> 00:11:26,070
There are certain places where if
you erase 4 bits, you lose,

203
00:11:26,070 --> 00:11:27,760
and there are other places
where if you

204
00:11:27,760 --> 00:11:31,100
erase 4 bits, you win.

205
00:11:31,100 --> 00:11:37,630
And that exercise also points
out that the low density

206
00:11:37,630 --> 00:11:39,880
parity check decoding that
we're going to do, the

207
00:11:39,880 --> 00:11:43,520
graphical decoding, may fail
in a case where maximum

208
00:11:43,520 --> 00:11:46,660
likelihood decoding
might work.

209
00:11:46,660 --> 00:11:48,770
But maximum likelihood decoding
is certainly the best

210
00:11:48,770 --> 00:11:53,040
we can do, so it's clear.

211
00:11:53,040 --> 00:11:56,090
You can't signal at a rate
greater than 1 minus p.

212
00:11:56,090 --> 00:11:59,480
You just don't get more than 1
minus p bits of information,

213
00:11:59,480 --> 00:12:03,920
or n times (1 minus p) bits of
information in a code word of

214
00:12:03,920 --> 00:12:08,030
length n, so you can't possibly
communicate more than

215
00:12:08,030 --> 00:12:13,000
n times (1 minus p)
bits in a block.

216
00:12:13,000 --> 00:12:14,250
OK.

217
00:12:16,820 --> 00:12:20,920
So what are we going
to try to do to

218
00:12:20,920 --> 00:12:24,640
signal over this channel?

219
00:12:24,640 --> 00:12:29,500
We're going to try using a low
density parity check code.

220
00:12:29,500 --> 00:12:31,330
Actually, I guess I did
want this first.

221
00:12:34,232 --> 00:12:37,790
Let me talk about both of
these back and forth.

222
00:12:37,790 --> 00:12:39,500
Sorry, Mr. TV guy.

223
00:12:42,000 --> 00:12:45,210
So we're going to use a low
density parity check code, and

224
00:12:45,210 --> 00:12:50,110
initially, we're going to assume
a regular code with,

225
00:12:50,110 --> 00:12:56,860
say, left degree is 3 over
here, right degree is 6.

226
00:12:56,860 --> 00:13:00,650
And we're going to try to decode
by using the iterative

227
00:13:00,650 --> 00:13:03,310
sum product algorithm with
a left right schedule.

228
00:13:07,340 --> 00:13:10,140
OK.

229
00:13:10,140 --> 00:13:13,550
I can work either
here or up here.

230
00:13:13,550 --> 00:13:20,470
What are the rules for sum
product update on a binary

231
00:13:20,470 --> 00:13:22,170
erasure channel?

232
00:13:22,170 --> 00:13:30,220
Let's just start out, and walk
through it a little bit, and

233
00:13:30,220 --> 00:13:33,195
then step back and develop
some more general rules.

234
00:13:35,700 --> 00:13:40,080
What is the message coming
in here that we

235
00:13:40,080 --> 00:13:41,680
receive from the channel?

236
00:13:41,680 --> 00:13:45,340
We're going to convert it
into an APP vector.

237
00:13:45,340 --> 00:13:48,480
What could the APP vector be?

238
00:13:48,480 --> 00:13:56,360
It's either, say, (0, 1) or
(1, 0), if the bit is unerased.

239
00:13:56,360 --> 00:14:00,300
So this would be-- if we get
this, we know a posteriori

240
00:14:00,300 --> 00:14:04,145
probability of a 0 is a
1, and of a 1 is a 0.

241
00:14:04,145 --> 00:14:06,240
No question, we have
certainty.

242
00:14:06,240 --> 00:14:10,980
Similarly down here, it's a 1.

243
00:14:10,980 --> 00:14:13,090
And in here, it's 1/2, 1/2.

244
00:14:15,610 --> 00:14:17,310
Complete uncertainty.

245
00:14:17,310 --> 00:14:18,560
No information.

246
00:14:20,610 --> 00:14:22,050
So, we can get--

247
00:14:22,050 --> 00:14:25,570
those are our three
possibilities off the channel.

248
00:14:25,570 --> 00:14:29,320
(0,1), (1,0), (1/2, 1/2).
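The mapping from channel output to APP vector can be written down directly. A tiny sketch, not from the lecture (the function name is mine):

```python
def channel_app(y):
    """APP vector (P(bit = 0), P(bit = 1)) for a BEC output y,
    where y is 0, 1, or None for an erasure."""
    if y is None:
        return (0.5, 0.5)   # erasure: complete uncertainty
    return (1.0, 0.0) if y == 0 else (0.0, 1.0)
```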

249
00:14:29,320 --> 00:14:32,380
Now, if we get a certain bit
coming in, what are the

250
00:14:32,380 --> 00:14:36,520
messages going out on each
of these lines here?

251
00:14:36,520 --> 00:14:38,720
We actually only need
to know this one.

252
00:14:38,720 --> 00:14:42,660
Initially, everything inside
here is complete ignorance.

253
00:14:42,660 --> 00:14:44,120
1/2, 1/2 everywhere.

254
00:14:44,120 --> 00:14:45,730
You can consider everything
to be erased.

255
00:14:48,420 --> 00:14:48,750
All right.

256
00:14:48,750 --> 00:14:53,470
Well, if we got a known bit
coming in, a 0 or a 1, the

257
00:14:53,470 --> 00:14:55,780
repetition node simply
says propagate

258
00:14:55,780 --> 00:14:57,260
that through all here.

259
00:14:57,260 --> 00:15:01,950
So if you worked out the sum
product update rule for here,

260
00:15:01,950 --> 00:15:07,100
it would basically say, in this
message, if any of these

261
00:15:07,100 --> 00:15:11,980
lines is known, then this line
is known and we have a certain

262
00:15:11,980 --> 00:15:13,950
bit going out.

263
00:15:13,950 --> 00:15:14,370
All right?

264
00:15:14,370 --> 00:15:17,140
So if 0, 1 comes in,
we'll get 0, 1 out.

265
00:15:17,140 --> 00:15:20,210
It's certainly a 1.

266
00:15:20,210 --> 00:15:26,240
Only in the case where all of
these other lines are erased--

267
00:15:26,240 --> 00:15:28,600
all these other incoming
messages are erasures-- then

268
00:15:28,600 --> 00:15:30,180
we don't know anything,
then the output

269
00:15:30,180 --> 00:15:32,260
has to be an erasure.

270
00:15:32,260 --> 00:15:32,780
All right?

271
00:15:32,780 --> 00:15:38,465
So that's the sum product update
rule at an equals node.

272
00:15:38,465 --> 00:15:38,850
All right?

273
00:15:38,850 --> 00:15:44,050
If any of these d minus 1
incoming messages is known,

274
00:15:44,050 --> 00:15:45,820
then the output is known.

275
00:15:45,820 --> 00:15:48,280
If they're all unknown, then
the output is unknown.

276
00:15:52,360 --> 00:15:55,490
You're going to find, in
general, these are the only

277
00:15:55,490 --> 00:15:58,010
kinds of messages we're ever
going to have to deal with.

278
00:15:58,010 --> 00:16:01,920
Either, we're basically going
to take known bits and

279
00:16:01,920 --> 00:16:04,235
propagate them through
the graph--

280
00:16:09,800 --> 00:16:12,770
so initially, everything is
erased, and after a while, we

281
00:16:12,770 --> 00:16:14,000
start learning things.

282
00:16:14,000 --> 00:16:16,910
More and more things become
known, and we succeed if

283
00:16:16,910 --> 00:16:20,000
everything becomes known
inside the graph.

284
00:16:20,000 --> 00:16:20,380
All right?

285
00:16:20,380 --> 00:16:23,500
So it's just the propagation of
unerased variables through

286
00:16:23,500 --> 00:16:25,272
this graph.

287
00:16:25,272 --> 00:16:26,522
AUDIENCE: [UNINTELLIGIBLE]

288
00:16:29,240 --> 00:16:30,490
PROFESSOR: No.

289
00:16:32,712 --> 00:16:36,400
So they're not only known,
but they're correct.

290
00:16:36,400 --> 00:16:40,550
And like everything else, you
can prove that by induction.

291
00:16:40,550 --> 00:16:43,650
The bits that we receive from
the channel certainly have to

292
00:16:43,650 --> 00:16:46,430
be consistent with the
correct code word.

293
00:16:46,430 --> 00:16:49,110
All these internal constraints
are the constraints of the

294
00:16:49,110 --> 00:16:52,830
code, so we can never generate
an incorrect message.

295
00:16:52,830 --> 00:16:55,080
That's basically the hand
waving proof of that.

296
00:16:57,800 --> 00:16:59,050
OK.

297
00:17:02,510 --> 00:17:06,410
So we're going to propagate
either known bits or erasures

298
00:17:06,410 --> 00:17:08,920
in the first iteration.

299
00:17:08,920 --> 00:17:12,490
And what's the fraction of these
lines that's going to be

300
00:17:12,490 --> 00:17:16,003
erased in a very long code?

301
00:17:16,003 --> 00:17:16,470
AUDIENCE: [UNINTELLIGIBLE]

302
00:17:16,470 --> 00:17:17,579
PROFESSOR: Its going to be p.

303
00:17:17,579 --> 00:17:18,099
All right?

304
00:17:18,099 --> 00:17:25,520
So initially, we have fraction
p erased and fraction 1 minus

305
00:17:25,520 --> 00:17:28,790
p which are good.

306
00:17:28,790 --> 00:17:30,230
OK.

307
00:17:30,230 --> 00:17:33,750
And then, this, we'll take this
to be a perfectly random

308
00:17:33,750 --> 00:17:34,380
interleaver.

309
00:17:34,380 --> 00:17:39,730
So perfectly randomly,
this comes out there.

310
00:17:39,730 --> 00:17:40,980
OK?

311
00:17:44,900 --> 00:17:47,410
All right, so now we have
various messages

312
00:17:47,410 --> 00:17:49,850
coming in over here.

313
00:17:49,850 --> 00:17:54,250
Some are erased, some are known
and correct, and that's

314
00:17:54,250 --> 00:17:56,730
the only things they can be.

315
00:17:56,730 --> 00:17:59,030
All right, what can we do
on the right side now?

316
00:17:59,030 --> 00:18:02,460
On the right side, we have to
execute the sum product

317
00:18:02,460 --> 00:18:05,400
algorithm for a zero sum
node of this type.

318
00:18:07,960 --> 00:18:09,210
What is the rule here?

319
00:18:11,880 --> 00:18:16,710
Clearly, if we get good data
on all these input bits, we

320
00:18:16,710 --> 00:18:18,980
know what the output bit is.

321
00:18:18,980 --> 00:18:22,570
So if we get five good ones over
here, we can tell what

322
00:18:22,570 --> 00:18:23,820
the sixth one has to be.

323
00:18:26,790 --> 00:18:33,690
However, if any of these is
erased, then what's the

324
00:18:33,690 --> 00:18:36,105
probability this
is a 0 or a 1?

325
00:18:36,105 --> 00:18:38,550
It's 1/2, 1/2.

326
00:18:38,550 --> 00:18:42,230
So any erasure here
means we get no

327
00:18:42,230 --> 00:18:43,800
information out of this node.

328
00:18:43,800 --> 00:18:47,230
We get an erasure coming out.
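The two node rules — an equals node's output is known as soon as any other input is known, and a check node's output is an erasure if any other input is erased — combine into a bit-level "peeling" view: a check whose neighbors are all known but one determines that last bit as their mod-2 sum. A sketch under that view, not from the lecture (names and the small example are mine):

```python
def iterative_erasure_decode(checks, received, max_iters=100):
    """Sum-product decoding specialized to the BEC.  `checks` is a
    list of checks, each a list of variable indices whose bits must
    sum to 0 mod 2; None in `received` marks an erasure.  A check
    with exactly one erased neighbour determines that bit; a check
    with two or more erased neighbours gives back no information."""
    word = list(received)
    for _ in range(max_iters):
        progress = False
        for chk in checks:
            unknown = [j for j in chk if word[j] is None]
            if len(unknown) == 1:
                j = unknown[0]
                word[j] = sum(word[i] for i in chk if i != j) % 2
                progress = True
        if not progress:
            break   # fully decoded, or stuck with residual erasures
    return word
```

On a long random LDPC graph this is exactly the left-right iteration; the small Hamming-code checks used in the test below are just a stand-in, and they also show how the decoder can stall when every check still has two or more erased neighbors.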

329
00:18:47,230 --> 00:18:56,960
All right, so we come in here,
and if p is some large number,

330
00:18:56,960 --> 00:19:01,200
the rate of this code is 1/2.

331
00:19:01,200 --> 00:19:04,460
So I'm going to do a simulation
for like p equals a

332
00:19:04,460 --> 00:19:06,290
little less--

333
00:19:06,290 --> 00:19:10,240
small enough so that this code
could succeed, 0.4--

334
00:19:10,240 --> 00:19:16,050
so the capacity is
0.6 bits per bit.

335
00:19:16,050 --> 00:19:21,140
But if this is 0.4, what's the
probability that any 5 of

336
00:19:21,140 --> 00:19:23,640
these are all going
to be unerased?

337
00:19:23,640 --> 00:19:24,890
It's pretty small.

338
00:19:28,850 --> 00:19:34,380
So you won't be surprised to
learn that the probability of

339
00:19:34,380 --> 00:19:36,900
an erasure coming back--

340
00:19:36,900 --> 00:19:38,800
call that q--

341
00:19:38,800 --> 00:19:42,930
equals 0.9 or more,
greater than 0.9.

342
00:19:42,930 --> 00:19:45,330
But it's not 1.

343
00:19:45,330 --> 00:19:49,290
So for some small fraction of
these over here, we're going

344
00:19:49,290 --> 00:19:51,780
to get some information, some
additional information, that

345
00:19:51,780 --> 00:19:53,490
we didn't have before.

346
00:19:53,490 --> 00:19:57,140
And this is going to propagate
randomly back, and it may

347
00:19:57,140 --> 00:20:02,240
allow us to now know some of
these bits that were initially

348
00:20:02,240 --> 00:20:03,490
erased on the channel.

349
00:20:07,620 --> 00:20:10,080
So that's the idea.

350
00:20:10,080 --> 00:20:14,410
So to understand the
performance of

351
00:20:14,410 --> 00:20:18,850
this, we simply track--

352
00:20:18,850 --> 00:20:22,850
let me call this, in general,
the erasure probability going

353
00:20:22,850 --> 00:20:26,680
from left to right, and this,
in general, we'll call the

354
00:20:26,680 --> 00:20:29,770
erasure probability going
from right to left.

355
00:20:32,630 --> 00:20:36,520
And we can actually compute what
these probabilities are

356
00:20:36,520 --> 00:20:39,810
for each iteration under the
assumption that the code is

357
00:20:39,810 --> 00:20:44,090
very long and random so that
every time we make a

358
00:20:44,090 --> 00:20:46,690
computation, we're dealing
with completely fresh and

359
00:20:46,690 --> 00:20:48,230
independent information.

360
00:20:48,230 --> 00:20:49,590
And that's what we're
going to do.

361
00:20:49,590 --> 00:20:50,170
Yes?

362
00:20:50,170 --> 00:20:51,420
AUDIENCE: [UNINTELLIGIBLE]

363
00:20:58,410 --> 00:21:02,120
PROFESSOR: When they come from
the right side, they're either

364
00:21:02,120 --> 00:21:04,180
erased or they're consistent.

365
00:21:07,610 --> 00:21:11,670
I argued before, waving my
hands, that these messages

366
00:21:11,670 --> 00:21:13,400
could never be incorrect.

367
00:21:13,400 --> 00:21:16,680
So if you get 2 known messages,
they can't conflict

368
00:21:16,680 --> 00:21:18,220
with each other.

369
00:21:18,220 --> 00:21:20,302
Is that your concern?

370
00:21:20,302 --> 00:21:20,788
AUDIENCE: Yeah.

371
00:21:20,788 --> 00:21:23,218
Because you're randomly
connecting [UNINTELLIGIBLE],

372
00:21:23,218 --> 00:21:26,620
so it might be that one of the
plus signs gave you an

373
00:21:26,620 --> 00:21:29,174
[UNINTELLIGIBLE], whereas
another plus sign gave you a

374
00:21:29,174 --> 00:21:30,106
proper message.

375
00:21:30,106 --> 00:21:33,050
And they both run back
to the same equation.

376
00:21:33,050 --> 00:21:34,450
PROFESSOR: Well, OK.

377
00:21:34,450 --> 00:21:36,430
So this is pseudo random,
but is chosen for

378
00:21:36,430 --> 00:21:37,280
once and for all.

379
00:21:37,280 --> 00:21:38,340
It determines the code.

380
00:21:38,340 --> 00:21:42,900
I don't re-choose it every time,
but when I analyze it,

381
00:21:42,900 --> 00:21:46,140
I'll assume that it's random
enough so that the bits that

382
00:21:46,140 --> 00:21:48,630
enter into any one calculation
are bits that I've never seen

383
00:21:48,630 --> 00:21:51,690
before, and therefore can be
taken to be entirely random.

384
00:21:51,690 --> 00:21:55,180
But of course, in actual
practice, you've got a fixed

385
00:21:55,180 --> 00:21:57,050
interleaver here, and you
have to, in order

386
00:21:57,050 --> 00:21:59,590
to decode the code.

387
00:21:59,590 --> 00:22:04,010
But the other concern here
is if we actually had the

388
00:22:04,010 --> 00:22:07,670
possibility of errors, the pure
binary erasure channel

389
00:22:07,670 --> 00:22:08,680
never allows errors.

390
00:22:08,680 --> 00:22:12,450
If this actually allowed a 0 to
go to a 1 or a 1 to go to

391
00:22:12,450 --> 00:22:14,860
0, then we'd have an altogether
different situation

392
00:22:14,860 --> 00:22:18,920
over here, and we'd have to
simply honestly compute the

393
00:22:18,920 --> 00:22:22,110
sum product algorithm and what
is the APP if we have some

394
00:22:22,110 --> 00:22:23,580
probability of error.

395
00:22:23,580 --> 00:22:25,890
And they could conflict, and
we'd have to weigh the

396
00:22:25,890 --> 00:22:29,340
evidence, and take the
dominating evidence, or mix it

397
00:22:29,340 --> 00:22:31,640
all up into the single
parameter

398
00:22:31,640 --> 00:22:32,890
that we call the APP.

399
00:22:36,460 --> 00:22:37,710
All right.

400
00:22:39,710 --> 00:22:45,695
So let me now do a
little analysis.

401
00:22:45,695 --> 00:22:48,060
Actually, I've done this
a couple places.

402
00:22:50,760 --> 00:22:55,110
Suppose the probability
of erasure here--

403
00:22:55,110 --> 00:23:00,620
this is the q right
to left parameter.

404
00:23:00,620 --> 00:23:06,160
Suppose the probability q
right to left is 0.9, or

405
00:23:06,160 --> 00:23:13,420
whatever, and this is the
original received message from

406
00:23:13,420 --> 00:23:16,685
the channel, which had an
erasure probability of p.

407
00:23:19,480 --> 00:23:22,530
What's the q left to right?

408
00:23:22,530 --> 00:23:28,036
What's the erasure probability
for the outgoing message?

409
00:23:28,036 --> 00:23:31,390
Well, the outgoing message is
erased only if the channel bit and

410
00:23:31,390 --> 00:23:34,480
all of these incoming messages are erased.

411
00:23:34,480 --> 00:23:38,450
All right, so this is simply
p times q right to

412
00:23:38,450 --> 00:23:42,960
left, to the d minus 1 power.

413
00:23:42,960 --> 00:23:44,045
OK?

414
00:23:44,045 --> 00:23:45,295
AUDIENCE: [UNINTELLIGIBLE]

415
00:23:47,910 --> 00:23:49,860
PROFESSOR: Assuming it's
a long random code, so

416
00:23:49,860 --> 00:23:52,580
everything here is
independent.

417
00:23:52,580 --> 00:23:55,360
I'll say something else about
this in just a second.

418
00:23:58,150 --> 00:24:01,495
But let's naively make that
assumption right now, and then

419
00:24:01,495 --> 00:24:04,830
see how best we can
justify it.

420
00:24:04,830 --> 00:24:06,020
What's the rule over here?

421
00:24:06,020 --> 00:24:11,770
Here, we're over on the right
side if we want to compute the

422
00:24:11,770 --> 00:24:13,430
right to left.

423
00:24:13,430 --> 00:24:18,140
If these are all erased with
probability q left to right,

424
00:24:18,140 --> 00:24:24,944
what is the probability that
this one going out is erased?

425
00:24:24,944 --> 00:24:27,770
Well, it's easier to compute
here the probability of not

426
00:24:27,770 --> 00:24:29,050
being erased.

427
00:24:29,050 --> 00:24:34,230
This is not erased only if all
of these are not erased.

428
00:24:34,230 --> 00:24:38,800
So we get q right to left.

429
00:24:38,800 --> 00:24:43,780
One minus q right to left is
equal to 1 minus q left to

430
00:24:43,780 --> 00:24:48,070
right, to the d minus 1.

431
00:24:48,070 --> 00:24:51,630
And let's see, this is d right,
and this is d left.

432
00:24:54,830 --> 00:24:58,770
I'm doing it for the
specific context.
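Spelled out in code, the two update rules just derived look like this. This is a sketch of the lecture's rules with the degrees as parameters; the function and variable names are mine, not from the notes:

```python
# Per-iteration erasure-probability updates for a regular LDPC code
# on the binary erasure channel, as derived on the board.

def left_update(p, q_rl, d_left):
    # A left-to-right message is erased only if the channel symbol is
    # erased AND all d_left - 1 other incoming messages are erased.
    return p * q_rl ** (d_left - 1)

def right_update(q_lr, d_right):
    # A right-to-left message is unerased only if all d_right - 1
    # other incoming messages are unerased.
    return 1 - (1 - q_lr) ** (d_right - 1)

# Example with the lecture's numbers: p = 0.4, q_{r->l} = 0.9
# left_update(0.4, 0.9, 3) -> 0.4 * 0.81 = 0.324
```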

433
00:24:58,770 --> 00:25:05,320
OK, so under the independence
assumption, we can compute

434
00:25:05,320 --> 00:25:09,900
exactly what these evolving
erasure probabilities are as

435
00:25:09,900 --> 00:25:12,520
we go through this left
right iteration.

436
00:25:12,520 --> 00:25:16,660
This is what's so neat about
this whole thing.

437
00:25:16,660 --> 00:25:22,430
Now, here's the best argument
for why these are all

438
00:25:22,430 --> 00:25:24,440
independent.

439
00:25:24,440 --> 00:25:30,690
Let's look at the messages
that enter into, say, a

440
00:25:30,690 --> 00:25:31,950
particular--

441
00:25:31,950 --> 00:25:35,610
this is computing q left
to right down here.

442
00:25:35,610 --> 00:25:40,030
All right, we've got something
coming in, one bit here.

443
00:25:40,030 --> 00:25:45,860
We've got more bits coming in
up here, and here, which

444
00:25:45,860 --> 00:25:48,800
originally came from bits
coming in up here.

445
00:25:48,800 --> 00:25:50,870
We have a tree of computation.

446
00:25:50,870 --> 00:25:54,120
If we went back through this
pseudo random but fixed

447
00:25:54,120 --> 00:25:58,900
interleaver, we could actually
draw this tree for every

448
00:25:58,900 --> 00:26:04,790
instance of every computation,
and this would be q left to

449
00:26:04,790 --> 00:26:07,610
right at the nth iteration,
this is--

450
00:26:07,610 --> 00:26:08,882
I'm sorry.

451
00:26:08,882 --> 00:26:12,470
Yeah, this is q left to right at
the nth iteration, this is

452
00:26:12,470 --> 00:26:17,630
q right to left at the n minus
first iteration, this is q

453
00:26:17,630 --> 00:26:22,300
left to right at the n minus
first iteration, and so forth.

454
00:26:26,500 --> 00:26:31,190
Now, the argument is
that if I go back--

455
00:26:31,190 --> 00:26:36,050
let's fix the number of
iterations I go back here--

456
00:26:36,050 --> 00:26:40,690
m, let's say, and I want to do
an analysis of the first m

457
00:26:40,690 --> 00:26:43,400
iterations.

458
00:26:43,400 --> 00:26:48,380
I claim that as this code
becomes long, n goes to

459
00:26:48,380 --> 00:26:54,340
infinity with fixed d lambda,
d rho, that the probability

460
00:26:54,340 --> 00:26:58,220
you're ever going to run into
a repeated bit or message up

461
00:26:58,220 --> 00:27:00,780
here goes to 0.

462
00:27:00,780 --> 00:27:02,230
All right?

463
00:27:02,230 --> 00:27:04,510
So I fix the number
of iterations I'm

464
00:27:04,510 --> 00:27:05,220
going to look at.

465
00:27:05,220 --> 00:27:07,500
I let the length of the
code go to infinity.

466
00:27:07,500 --> 00:27:11,340
I let everything be chosen
pseudo randomly over here.

467
00:27:11,340 --> 00:27:17,570
Then the probability of seeing
the same message or bit twice

468
00:27:17,570 --> 00:27:19,680
in this tree goes to 0.

469
00:27:19,680 --> 00:27:22,740
And therefore, in that limit,
the independence assumption

470
00:27:22,740 --> 00:27:23,590
becomes valid.

471
00:27:23,590 --> 00:27:27,390
That is basically the
argument, all right?

472
00:27:27,390 --> 00:27:29,840
So I can analyze any
fixed number of

473
00:27:29,840 --> 00:27:31,190
iterations in this way.

474
00:27:39,142 --> 00:27:40,392
AUDIENCE: [UNINTELLIGIBLE]

475
00:27:42,640 --> 00:27:43,770
PROFESSOR: OK, yes.

476
00:27:43,770 --> 00:27:45,140
Good.

477
00:27:45,140 --> 00:27:51,010
So this is saying the girth
is probabilistically--

478
00:27:51,010 --> 00:27:59,860
so limit in probability going
to infinity, or it's also

479
00:27:59,860 --> 00:28:02,110
referred to as the locally
tree-like assumption.

480
00:28:05,330 --> 00:28:09,260
OK, graph in the neighborhood
of any node--

481
00:28:09,260 --> 00:28:12,380
this is kind of a map of the
neighborhood back for a

482
00:28:12,380 --> 00:28:13,630
distance of m--

483
00:28:16,730 --> 00:28:18,640
we're not ever going to
run into any cycles.

484
00:28:23,530 --> 00:28:26,420
Good, thank you.

485
00:28:26,420 --> 00:28:29,270
OK, so under that assumption,
now we

486
00:28:29,270 --> 00:28:30,760
can do an exact analysis.

487
00:28:30,760 --> 00:28:32,010
This is what's amazing.

488
00:28:35,760 --> 00:28:36,760
And how do we do it?

489
00:28:36,760 --> 00:28:39,180
Here's a good way of doing it.

490
00:28:39,180 --> 00:28:42,120
We just draw the curves of these
2 equations, and we go

491
00:28:42,120 --> 00:28:43,480
back and forth between them.

492
00:28:46,180 --> 00:28:49,980
And this was actually a
technique invented earlier for

493
00:28:49,980 --> 00:28:53,130
turbo codes, but it works very
nicely for low density parity

494
00:28:53,130 --> 00:28:54,890
check code analysis.

495
00:28:54,890 --> 00:28:57,280
It's called the exit chart.

496
00:28:57,280 --> 00:29:01,570
I've drawn it in a somewhat
peculiar way, but it's so that

497
00:29:01,570 --> 00:29:04,360
it will look like the exit
charts you might see in the

498
00:29:04,360 --> 00:29:06,318
literature.

499
00:29:06,318 --> 00:29:10,100
So I'm just drawing q right to
left on this axis, and q left

500
00:29:10,100 --> 00:29:11,870
to right on this axis.

501
00:29:11,870 --> 00:29:15,250
I want to sort of start in the
lower left and work my way up

502
00:29:15,250 --> 00:29:18,030
to the upper right, which
is the way exit

503
00:29:18,030 --> 00:29:19,160
charts always work.

504
00:29:19,160 --> 00:29:23,610
So to do that, I basically
invert the axis and take it

505
00:29:23,610 --> 00:29:25,410
from 1 down to 0.

506
00:29:25,410 --> 00:29:27,640
Initially, both of these--

507
00:29:27,640 --> 00:29:30,350
the probability is one that
everything is erased

508
00:29:30,350 --> 00:29:34,830
internally on every edge, and if
things work out, we'll get

509
00:29:34,830 --> 00:29:38,390
up to the point where nothing
is erased with high

510
00:29:38,390 --> 00:29:39,640
probability.

511
00:29:41,500 --> 00:29:46,290
OK, these are our 2 equations
just copied from over there

512
00:29:46,290 --> 00:29:49,810
for the specific case of left
degree equals 3 and right

513
00:29:49,810 --> 00:29:52,900
degree equals 6.

514
00:29:52,900 --> 00:29:56,650
And so I just plot the curves
of these 2 equations.

515
00:29:56,650 --> 00:30:02,090
This is done in the notes, and
the important thing is that

516
00:30:02,090 --> 00:30:08,660
the curves don't cross, for
a value of p equal 0.4.

517
00:30:08,660 --> 00:30:13,030
One of these curves depends on
p, the other one doesn't.

518
00:30:13,030 --> 00:30:16,390
So this is just a simple little
quadratic curve here,

519
00:30:16,390 --> 00:30:19,150
and this is a fifth order
curve, and they look

520
00:30:19,150 --> 00:30:22,000
something like this.

521
00:30:22,000 --> 00:30:22,870
What does this mean?

522
00:30:22,870 --> 00:30:25,400
Initially, the q right
to left is 1.

523
00:30:25,400 --> 00:30:30,800
If I go through one iteration,
using the fact that I get this

524
00:30:30,800 --> 00:30:34,010
external information--

525
00:30:34,010 --> 00:30:35,610
extrinsic information--

526
00:30:35,610 --> 00:30:39,570
then q left to right becomes
0.4, so we go to the outer curve.

527
00:30:39,570 --> 00:30:45,130
Now, I have q left to right
propagating to the right side,

528
00:30:45,130 --> 00:30:50,290
and at this point, I get
something like 0.922, I think

529
00:30:50,290 --> 00:30:52,700
is the first one.

530
00:30:52,700 --> 00:30:55,280
So the q right to left
has gone from

531
00:30:55,280 --> 00:30:58,210
1 down to 0.9 something.

532
00:30:58,210 --> 00:31:00,260
OK, but that's better.

533
00:31:00,260 --> 00:31:05,050
Now, with that value of q, of
course I get a much more

534
00:31:05,050 --> 00:31:07,510
favorable situation
on the left.

535
00:31:07,510 --> 00:31:10,730
I go over to the left
side, and now I get

536
00:31:10,730 --> 00:31:14,240
some p equal to--

537
00:31:14,240 --> 00:31:17,670
this is all done
in the notes--

538
00:31:17,670 --> 00:31:20,880
0.34.

539
00:31:20,880 --> 00:31:23,790
So I've reduced my erasure
probability going from left to

540
00:31:23,790 --> 00:31:33,000
right, which in turn, helps me
out as I go over here, 0.875,

541
00:31:33,000 --> 00:31:35,490
and so forth.
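The back-and-forth just traced (1, then 0.4, then 0.922, then 0.34, then 0.875, and so on) can be replayed numerically; a minimal sketch for the (3,6) code at p = 0.4, with names of my choosing:

```python
# Replaying the exit-chart staircase for the regular (3,6) code at
# p = 0.4: start with everything erased and alternate the two updates.

def evolve(p, iters, d_left=3, d_right=6):
    q_rl = 1.0                                  # all messages erased at start
    trace = []
    for _ in range(iters):
        q_lr = p * q_rl ** (d_left - 1)         # left-node update
        q_rl = 1 - (1 - q_lr) ** (d_right - 1)  # right-node update
        trace.append((q_lr, q_rl))
    return trace

trace = evolve(0.4, 50)
# trace[0] is about (0.4, 0.922), trace[1] about (0.340, 0.875), and by
# the end both erasure probabilities have collapsed to zero.
```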

542
00:31:35,490 --> 00:31:36,050
Are you with me?

543
00:31:36,050 --> 00:31:38,730
Does everyone see
what I'm doing?

544
00:31:38,730 --> 00:31:41,050
Any questions?

545
00:31:41,050 --> 00:31:44,250
Again, I'm claiming this is
an exact calculation--

546
00:31:44,250 --> 00:31:46,050
or I would call it
a simulation--

547
00:31:46,050 --> 00:31:48,830
of what the algorithm does
in each iteration.

548
00:31:48,830 --> 00:31:52,810
First iteration, first full,
left, right, right left, you

549
00:31:52,810 --> 00:31:53,490
get to here.

550
00:31:53,490 --> 00:31:56,840
Second one, you get to
here, and so forth.

551
00:31:56,840 --> 00:32:00,640
And I claim as n goes to
infinity, and everything is

552
00:32:00,640 --> 00:32:05,530
random, this is the
way the erasure

553
00:32:05,530 --> 00:32:06,800
probabilities will evolve.

554
00:32:09,440 --> 00:32:17,320
And it's clear visually that if
the curves don't cross, we

555
00:32:17,320 --> 00:32:21,060
get to the upper right corner,
which means decoding succeeds.

556
00:32:21,060 --> 00:32:26,640
There are no erasures anywhere
at the end of the day.

557
00:32:26,640 --> 00:32:29,790
And furthermore, you go and
you take a very long code,

558
00:32:29,790 --> 00:32:33,050
like 10 to the seventh bits,
and you simulate it on this

559
00:32:33,050 --> 00:32:36,120
channel, and it will behave
exactly like this.

560
00:32:36,120 --> 00:32:40,100
OK, so this is really a good
piece of analysis.

561
00:32:40,100 --> 00:32:43,810
So this reduces it to
very simple terms.

562
00:32:43,810 --> 00:32:48,760
We have 2 equations, and of
course they meet here at the

563
00:32:48,760 --> 00:32:50,700
(0,0) point.

564
00:32:50,700 --> 00:32:52,970
Substitute 0 in here,
you get 0 there.

565
00:32:52,970 --> 00:32:55,690
Substitute 0 here,
you get 0 there.

566
00:32:55,690 --> 00:32:59,490
But if they don't meet anywhere
else, if there's no

567
00:32:59,490 --> 00:33:05,860
fixed point to this iterative
convergence, then decoding is

568
00:33:05,860 --> 00:33:07,680
going to succeed.

569
00:33:07,680 --> 00:33:10,920
So this is the whole question:
can we design 2 curves that

570
00:33:10,920 --> 00:33:12,170
don't cross?

571
00:33:22,020 --> 00:33:23,270
OK.

572
00:33:24,980 --> 00:33:29,330
So what do we expect
now to happen?

573
00:33:29,330 --> 00:33:32,670
Suppose we increase p.

574
00:33:32,670 --> 00:33:37,540
Suppose we increase p to 0.45,
which is another case that's

575
00:33:37,540 --> 00:33:41,580
considered in the notes,
what's going to happen?

576
00:33:41,580 --> 00:33:43,990
This curve is just a simple
quadratic, it's going to be

577
00:33:43,990 --> 00:33:45,720
dragged down a little bit.

578
00:33:45,720 --> 00:33:52,570
We're going to get some
different curve, which is just

579
00:33:52,570 --> 00:33:57,220
this curve scaled by
0.45 over 0.4.

580
00:33:57,220 --> 00:33:59,790
It's going to start here,
and it's going to

581
00:33:59,790 --> 00:34:00,980
be this scaled curve.

582
00:34:00,980 --> 00:34:03,895
And unfortunately, those
2 curves cross.

583
00:34:07,550 --> 00:34:12,510
So that's the way it's going
to look, and now, again, we

584
00:34:12,510 --> 00:34:18,429
can simulate iterative decoding
for this case.

585
00:34:18,429 --> 00:34:20,290
Again, initially,
we'll start out.

586
00:34:20,290 --> 00:34:24,760
We'll go from 1, 0.45 will be
our right going erasure

587
00:34:24,760 --> 00:34:25,350
probability.

588
00:34:25,350 --> 00:34:28,980
We'll go over here, make
some progress, but

589
00:34:28,980 --> 00:34:30,690
what's going to happen?

590
00:34:30,690 --> 00:34:32,395
We're going to get stuck
right there.

591
00:34:36,260 --> 00:34:37,639
So we find the fixed point.

592
00:34:37,639 --> 00:34:40,480
In fact, this simulation is
a very efficient way of

593
00:34:40,480 --> 00:34:45,170
calculating what the fixed point
of these 2 curves are.

594
00:34:45,170 --> 00:34:47,770
Probably some of you are
analytical whizzes and can do

595
00:34:47,770 --> 00:34:50,350
it analytically, but
it's not that easy

596
00:34:50,350 --> 00:34:51,600
for a quintic equation.
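Locating that fixed point numerically is exactly the simulation just described: run the recursion until it stops moving. A sketch under the lecture's (3,6) assumptions (the helper name is mine):

```python
# Iterating the (3,6) recursion until it stalls locates the fixed
# point where the two curves cross -- no need to solve the quintic.

def residual_erasure(p, d_left=3, d_right=6, tol=1e-12, max_iter=100000):
    q_rl = 1.0
    for _ in range(max_iter):
        q_lr = p * q_rl ** (d_left - 1)
        nxt = 1 - (1 - q_lr) ** (d_right - 1)
        if abs(nxt - q_rl) < tol:
            break
        q_rl = nxt
    return p * q_rl ** (d_left - 1)  # left-to-right erasure rate at the stall

# residual_erasure(0.40) is essentially 0: decoding succeeds.
# residual_erasure(0.45) stays bounded away from 0: decoding gets stuck.
```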

597
00:34:55,699 --> 00:35:00,430
In any case, as far as decoding
is concerned--

598
00:35:00,430 --> 00:35:03,390
all right, this code doesn't
work on an erasure channel

599
00:35:03,390 --> 00:35:05,940
which has an erasure probability
of 0.45.

600
00:35:05,940 --> 00:35:10,120
It does work on one that has an
erasure probability of 0.4.

601
00:35:10,120 --> 00:35:16,770
That should suggest
to you-- yeah?

602
00:35:16,770 --> 00:35:18,020
AUDIENCE: [UNINTELLIGIBLE]

603
00:35:20,520 --> 00:35:23,520
PROFESSOR: Yes, so this code
doesn't get to capacity.

604
00:35:23,520 --> 00:35:24,770
Too bad.

605
00:35:27,690 --> 00:35:33,540
So I'm not claiming that a
regular d left equals 3, d

606
00:35:33,540 --> 00:35:37,010
right equals 6 LDPC code
can achieve capacity.

607
00:35:40,030 --> 00:35:44,030
There's some threshold for p,
below which it'll work, and

608
00:35:44,030 --> 00:35:45,830
above which it won't work.

609
00:35:45,830 --> 00:35:50,460
That threshold is somewhere
between 0.4 and 0.45.

610
00:35:50,460 --> 00:35:53,510
In fact, it's 0.429 something
or other.
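That threshold can be pinned down by bisection on p, running the same recursion at each trial value; a sketch, with a finite iteration cap as a practical stand-in for "converges":

```python
# Bisecting for the (3,6) threshold: the largest channel erasure
# probability p for which the iteration drives q to zero.

def decoding_succeeds(p, d_left=3, d_right=6, iters=5000, tol=1e-10):
    q_rl = 1.0
    for _ in range(iters):
        q_lr = p * q_rl ** (d_left - 1)
        q_rl = 1 - (1 - q_lr) ** (d_right - 1)
        if q_rl < tol:
            return True
    return False

lo, hi = 0.0, 1.0
for _ in range(40):
    mid = (lo + hi) / 2
    if decoding_succeeds(mid):
        lo = mid
    else:
        hi = mid
threshold = lo
# threshold comes out near 0.429, strictly below the capacity
# value 0.5 for this rate-1/2 code.
```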

611
00:35:53,510 --> 00:36:00,470
So this design approach will
get near capacity,

612
00:36:00,470 --> 00:36:02,790
but I certainly don't
claim this is a

613
00:36:02,790 --> 00:36:04,350
capacity approaching code.

614
00:36:13,360 --> 00:36:17,920
I might mention now something
called the area theorem,

615
00:36:17,920 --> 00:36:20,310
because it's easy to do
now and it will be

616
00:36:20,310 --> 00:36:22,730
harder to do later.

617
00:36:22,730 --> 00:36:24,140
What is this area here?

618
00:36:28,340 --> 00:36:31,240
I'm saying the area above
this curve here.

619
00:36:35,344 --> 00:36:38,740
Well, you can do that simply
by integrating this.

620
00:36:38,740 --> 00:36:46,870
It's integral of p times
q-squared dq from 0 to 1, and

621
00:36:46,870 --> 00:36:49,840
it turns out to be p over 3.

622
00:36:49,840 --> 00:36:51,090
Believe me?

623
00:36:54,360 --> 00:37:00,130
Which happens to be p over
the left degree.

624
00:37:00,130 --> 00:37:06,230
Not fortuitously, because this
is the left degree minus 1.

625
00:37:06,230 --> 00:37:08,800
So you're always going to get
p over the left degree.

626
00:37:11,780 --> 00:37:14,900
And what's the area
under here?

627
00:37:14,900 --> 00:37:19,070
Well, I can compute--

628
00:37:19,070 --> 00:37:22,340
basically change variables to
1 minus q, q prime, and 1

629
00:37:22,340 --> 00:37:26,590
minus q is q prime over here,
and so I'll get the same kind

630
00:37:26,590 --> 00:37:31,590
of calculation, 0 to 1, this
time q to the fifth dq,

631
00:37:31,590 --> 00:37:36,020
which is 1/6, which not

632
00:37:36,020 --> 00:37:39,370
fortuitously is 1 over d right.

633
00:37:39,370 --> 00:37:46,920
So the area here is p over
3, and the area here is--

634
00:37:46,920 --> 00:37:49,460
under this side of
the curve is--

635
00:37:49,460 --> 00:37:51,340
that must be 5/6.

636
00:37:51,340 --> 00:37:55,090
Sorry, so the area under
this side is 1/6 so

637
00:37:55,090 --> 00:37:56,400
it's 1 minus this.

638
00:38:07,490 --> 00:38:10,560
It's clearly the big part,
so this is 5/6.
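Those two integrals are easy to check numerically; a sketch confirming p/3 and 1/6 for the p = 0.4, (3,6) chart, using a plain midpoint rule:

```python
# Midpoint-rule check of the two areas in the area-theorem argument:
# area above the left curve = integral_0^1 p*q^2 dq = p/3,
# area for the right curve  = integral_0^1 q^5 dq   = 1/6.

p, n = 0.4, 100_000
h = 1.0 / n
area_left = sum(p * ((i + 0.5) * h) ** 2 for i in range(n)) * h
area_right = sum(((i + 0.5) * h) ** 5 for i in range(n)) * h

# area_left is about 0.1333 (= p/3), area_right about 0.1667 (= 1/6),
# and p/3 + (1 - 1/6) = 0.9667 < 1, consistent with the curves not
# crossing at p = 0.4.
```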

639
00:38:15,560 --> 00:38:15,980
All right.

640
00:38:15,980 --> 00:38:19,230
I've claimed my criterion for
successful decoding is that

641
00:38:19,230 --> 00:38:21,980
these curves not cross.

642
00:38:21,980 --> 00:38:31,010
All right, so for successful
decoding, clearly the sum of

643
00:38:31,010 --> 00:38:35,940
these 2 areas has to be
less than 1, right?

644
00:38:35,940 --> 00:38:50,210
So successful decoding: a
necessary condition is that p

645
00:38:50,210 --> 00:38:52,500
over d_lambda --

646
00:38:52,500 --> 00:38:56,510
let me just extend this
to any regular code--

647
00:38:56,510 --> 00:39:03,050
plus 1 minus 1 over d_rho
has to be less than 1.

648
00:39:09,490 --> 00:39:13,320
OK, what does this sum out to?

649
00:39:13,320 --> 00:39:27,530
This says that p has to be less
than d_lambda over d_rho,

650
00:39:27,530 --> 00:39:30,290
which happens to be 1
minus r, right?

651
00:39:32,960 --> 00:39:38,835
Or equivalently, r less than 1
minus p, which is capacity.

652
00:39:42,200 --> 00:39:44,150
So what did I just prove
very quickly?

653
00:39:44,150 --> 00:39:48,800
I proved that for a regular low
density parity check code,

654
00:39:48,800 --> 00:39:54,370
just considering the areas under
these 2 curves and the

655
00:39:54,370 --> 00:39:59,490
requirement that the 2 curves
must not cross, I find that

656
00:39:59,490 --> 00:40:04,200
regular codes can't possibly
work for a rate any greater

657
00:40:04,200 --> 00:40:06,520
than 1 minus p, which
is capacity.

658
00:40:06,520 --> 00:40:10,160
In fact, the rate has to be less
than 1 minus p, strictly

659
00:40:10,160 --> 00:40:13,020
less, in order for there to--

660
00:40:13,020 --> 00:40:16,870
unless we were lucky enough just
to get 2 curves that were

661
00:40:16,870 --> 00:40:18,620
right on top of each other.

662
00:40:18,620 --> 00:40:19,840
I don't know whether that
would work or not.

663
00:40:19,840 --> 00:40:21,330
I guess it doesn't work.

664
00:40:21,330 --> 00:40:23,735
But we'd need them to be
just a scooch apart.

665
00:40:27,010 --> 00:40:30,170
OK, so I can make an inequality
sign here.

666
00:40:30,170 --> 00:40:33,920
OK, well that's rather
gratifying.

667
00:40:38,160 --> 00:40:43,510
What do we do to improve
the situation?

668
00:40:43,510 --> 00:40:45,560
OK, one--

669
00:40:45,560 --> 00:40:47,570
it's probably the first thing
you would think of

670
00:40:47,570 --> 00:40:50,830
investigating maybe at this
point, why don't we look at an

671
00:40:50,830 --> 00:40:53,640
irregular LDPC code?

672
00:41:07,360 --> 00:41:11,320
And I'm going to characterize
such a code by--

673
00:41:11,320 --> 00:41:18,650
there's going to be some
distribution on the left side,

674
00:41:18,650 --> 00:41:22,580
which I might write
by lambda_d.

675
00:41:22,580 --> 00:41:26,920
This is going to be the
fraction of left

676
00:41:26,920 --> 00:41:33,910
nodes of degree d.

677
00:41:33,910 --> 00:41:36,290
All right, I'll simply let that
be some distribution.

678
00:41:36,290 --> 00:41:39,250
Some might have degree 2, some
might have degree 3.

679
00:41:39,250 --> 00:41:44,270
Some might have degree 500.

680
00:41:44,270 --> 00:41:51,000
And similarly, rho_d
is the fraction of

681
00:41:51,000 --> 00:41:54,170
right nodes, et cetera.

682
00:42:01,500 --> 00:42:06,430
And there's some average degree
here, and some average

683
00:42:06,430 --> 00:42:09,250
degree here.

684
00:42:09,250 --> 00:42:12,950
So this is the average degree,
or the typical degree.

685
00:42:16,500 --> 00:42:20,500
This is average left degree,
this is average right degree.

686
00:42:23,990 --> 00:42:26,470
If I do that, then
the calculations

687
00:42:26,470 --> 00:42:29,640
are done in the notes.

688
00:42:29,640 --> 00:42:32,070
I won't take the time to do them
here, but basically you

689
00:42:32,070 --> 00:42:38,190
find the rate of the code is 1
minus the average left degree

690
00:42:38,190 --> 00:42:39,760
over the average right degree.
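As a quick sketch of that rate formula (the irregular degree fractions below are made-up illustrations, not distributions from the notes):

```python
# Rate of an LDPC code from its node-degree distributions:
# r = 1 - (average left degree) / (average right degree).

def rate(left_fracs, right_fracs):
    # each argument maps node degree d -> fraction of nodes of degree d
    avg_left = sum(d * f for d, f in left_fracs.items())
    avg_right = sum(d * f for d, f in right_fracs.items())
    return 1 - avg_left / avg_right

# Regular (3,6) case: rate 1 - 3/6 = 1/2.
# For an irregular mix, only the averages enter:
# rate({2: 0.5, 3: 0.5}, {5: 1.0}) -> 1 - 2.5/5 = 0.5
```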

691
00:42:42,860 --> 00:42:45,180
OK, so it reduces to
the previous case

692
00:42:45,180 --> 00:42:47,820
and the regular case.

693
00:42:47,820 --> 00:42:51,440
Regular case, this is 1 for one
particular degree and 0

694
00:42:51,440 --> 00:42:52,690
for everything else.

695
00:42:56,050 --> 00:42:59,160
It works out.

696
00:42:59,160 --> 00:43:02,910
If I do that and go through
exactly the same analysis with

697
00:43:02,910 --> 00:43:06,690
my computation tree, now I
simply have a distribution of

698
00:43:06,690 --> 00:43:11,710
degrees at each level of the
computation tree, and you will

699
00:43:11,710 --> 00:43:18,040
not be surprised to hear what I
get out as my left to right

700
00:43:18,040 --> 00:43:22,315
equations, is I get out
some average of this.

701
00:43:25,350 --> 00:43:39,090
In fact, what I get out now is
that q left to right is the

702
00:43:39,090 --> 00:43:43,520
sum over d of--

703
00:43:43,520 --> 00:43:55,590
this is going to be lambda_d
times p times q right to left

704
00:43:55,590 --> 00:43:58,810
to the d minus 1.

705
00:43:58,810 --> 00:44:02,140
Which again reduces to the
previous thing, if only one of

706
00:44:02,140 --> 00:44:06,250
these is 1 and the rest are 0.

707
00:44:06,250 --> 00:44:07,500
So I just get the--

708
00:44:09,920 --> 00:44:11,570
this is just an expectation.

709
00:44:11,570 --> 00:44:13,830
This is the fraction
of erasures.

710
00:44:13,830 --> 00:44:17,860
I just count the number of times
I go through a node of

711
00:44:17,860 --> 00:44:20,990
degree d, and for that fraction
of time, I'm going to

712
00:44:20,990 --> 00:44:25,340
get this relationship, and so
I just average over them.

713
00:44:25,340 --> 00:44:26,450
That's very quick.

714
00:44:26,450 --> 00:44:29,733
Look at the notes for a detailed
derivation, but I

715
00:44:29,733 --> 00:44:32,840
hope it's intuitively
plausible.

716
00:44:32,840 --> 00:44:39,812
And similarly, 1 minus q right
to left is the sum over d

717
00:44:39,812 --> 00:44:48,365
of rho_d, 1 minus q left to
right to the d minus 1.

718
00:44:52,510 --> 00:44:55,970
OK, this is elegantly
done if we

719
00:44:55,970 --> 00:44:59,710
define generating functions.

720
00:44:59,710 --> 00:45:03,490
We do that over here.

721
00:45:03,490 --> 00:45:05,600
I've lost it now so I'll
do it over here.

722
00:45:08,500 --> 00:45:11,230
So what you'll see in the
literature is generating

723
00:45:11,230 --> 00:45:15,920
functions defined as lambda of x
equals sum over d of lambda_d

724
00:45:15,920 --> 00:45:18,740
x to the d minus 1.

725
00:45:18,740 --> 00:45:25,650
And rho of x equals sum over d,
rho_d, x to the d minus 1.

726
00:45:25,650 --> 00:45:28,360
And then these equations
are simply written as--

727
00:45:28,360 --> 00:45:35,410
this is p times lambda of q
right to left, and this is

728
00:45:35,410 --> 00:45:42,710
equal to rho of 1 minus
q left to right.

729
00:45:46,870 --> 00:45:49,630
OK, so we get nice, elegant
generating function

730
00:45:49,630 --> 00:45:50,880
representations.
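In code, the generating-function form of the equations is equally compact. The degree distributions below are illustrative stand-ins, not the optimized ones from Sae-Young Chung's thesis:

```python
# Irregular density evolution via generating functions:
#   q_{l->r} = p * lambda(q_{r->l}),
#   1 - q_{r->l} = rho(1 - q_{l->r}),
# with lambda(x) = sum_d lambda_d x^(d-1), rho(x) = sum_d rho_d x^(d-1).

def gen_poly(coeffs):
    # coeffs maps degree d to its weight lambda_d (or rho_d)
    return lambda x: sum(w * x ** (d - 1) for d, w in coeffs.items())

lam = gen_poly({2: 0.5, 3: 0.5})   # hypothetical left distribution
rho = gen_poly({6: 1.0})           # all right nodes of degree 6

def evolve_irregular(p, iters=2000):
    q_rl = 1.0
    for _ in range(iters):
        q_lr = p * lam(q_rl)
        q_rl = 1 - rho(1 - q_lr)
    return q_rl

# With lam = gen_poly({3: 1.0}) this collapses to the regular (3,6)
# recursion analyzed earlier.
```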

731
00:45:52,840 --> 00:45:56,350
But from the point of view of
the curves, we're basically

732
00:45:56,350 --> 00:45:58,110
just going to average
these curves.

733
00:45:58,110 --> 00:46:02,520
So we now replace these
equations up here by the

734
00:46:02,520 --> 00:46:03,770
average equations.

735
00:46:11,940 --> 00:46:18,100
This becomes p times lambda of
q right to left, and this

736
00:46:18,100 --> 00:46:25,950
becomes rho of 1 minus
q left to right.

737
00:46:25,950 --> 00:46:30,760
OK, but again, I'm going to
reduce all of this to 2 curves,

738
00:46:30,760 --> 00:46:34,770
which again I can use
for a simulation.

739
00:46:34,770 --> 00:46:38,900
And now I have lots of
degrees of freedom.

740
00:46:38,900 --> 00:46:41,630
I could change all these lambdas
and all these rhos,

741
00:46:41,630 --> 00:46:45,180
and I can explore the space,
and that's what's Sae-Young

742
00:46:45,180 --> 00:46:48,530
Chung did in his thesis, not
so much for this channel.

743
00:46:48,530 --> 00:46:53,260
He did do it for this channel,
but also for additive white

744
00:46:53,260 --> 00:46:54,610
Gaussian noise channels.

745
00:46:54,610 --> 00:47:00,980
And so the idea is you try to
make these 2 curves just as

746
00:47:00,980 --> 00:47:02,910
close together as you can.

747
00:47:08,880 --> 00:47:09,820
Something like that.

748
00:47:09,820 --> 00:47:11,730
Or, of course, you can
do other tricks.

749
00:47:11,730 --> 00:47:15,690
You can have some of these--

750
00:47:15,690 --> 00:47:17,300
you can have some bits
over here that go

751
00:47:17,300 --> 00:47:18,250
to the outside world.

752
00:47:18,250 --> 00:47:20,520
You can suppress some
of these bits here.

753
00:47:20,520 --> 00:47:23,800
You can play around
with the graph.

754
00:47:23,800 --> 00:47:25,580
No limit on invention.

755
00:47:25,580 --> 00:47:27,490
But you don't really have
to do any of that.

756
00:47:30,430 --> 00:47:37,790
So it becomes a curve fitting
exercise, and you can imagine

757
00:47:37,790 --> 00:47:39,630
doing this in your thesis,
except you were

758
00:47:39,630 --> 00:47:41,095
not born soon enough.

759
00:47:44,950 --> 00:47:48,090
The interesting point here is
that this now becomes--

760
00:47:48,090 --> 00:47:51,470
the area becomes p over
d_lambda-bar, again,

761
00:47:51,470 --> 00:47:54,320
proof in the notes.

762
00:47:54,320 --> 00:48:00,790
This becomes 1 minus
1 over d_rho-bar.

763
00:48:04,820 --> 00:48:06,720
And so again, the
area theorem--

764
00:48:12,540 --> 00:48:15,880
in order for these curves not to
cross, we've got to have p

765
00:48:15,880 --> 00:48:25,860
over d_lambda-bar plus 1 minus
1 over d_rho-bar, less than

766
00:48:25,860 --> 00:48:30,230
the area of the whole exit
chart, which is 1.

767
00:48:30,230 --> 00:48:33,870
We again find that--

768
00:48:33,870 --> 00:48:39,870
let me put it this way, 1
minus d_lambda-bar over

769
00:48:39,870 --> 00:48:46,450
d_rho-bar is less than 1 minus
p, which is equivalent to the

770
00:48:46,450 --> 00:48:50,660
rate must be less than the
capacity of the channel.

771
00:48:50,660 --> 00:48:52,855
So this is a very nice,
elegant result.

772
00:48:52,855 --> 00:48:56,530
The area theorem says that no
matter how you play with these

773
00:48:56,530 --> 00:48:59,430
degree distributions in an
irregular low-density parity

774
00:48:59,430 --> 00:49:04,790
check code, you of course can
never get above capacity.

775
00:49:04,790 --> 00:49:10,010
But, it certainly suggests that
you might be able to play

776
00:49:10,010 --> 00:49:13,060
around with these curves such
that they get as close as you

777
00:49:13,060 --> 00:49:14,170
might like.

778
00:49:14,170 --> 00:49:17,270
And the converse of this is
that if you can make these

779
00:49:17,270 --> 00:49:20,890
arbitrarily close to each other,
then you can achieve

780
00:49:20,890 --> 00:49:22,830
rates arbitrarily close
to capacity.

781
00:49:26,630 --> 00:49:29,850
And that, in fact, is true.

782
00:49:29,850 --> 00:49:33,060
So simply by going to irregular
low-density parity

783
00:49:33,060 --> 00:49:36,810
check codes, we can get as close
as we like, arbitrarily

784
00:49:36,810 --> 00:49:41,610
close, to the capacity of the
binary erasure channel with

785
00:49:41,610 --> 00:49:44,276
this kind of iterative
decoding.

786
00:49:44,276 --> 00:49:46,320
And you can see the kind
of trade you're

787
00:49:46,320 --> 00:49:47,010
going to have to make.

788
00:49:47,010 --> 00:49:51,160
Obviously, you're going to have
more iterations as these

789
00:49:51,160 --> 00:49:52,670
get very close.

790
00:49:52,670 --> 00:49:55,450
What is the decoding process
going to look like?

791
00:49:55,450 --> 00:49:59,960
It's going to look like very
fine grained steps here, lots

792
00:49:59,960 --> 00:50:02,650
of iterations, but--

793
00:50:02,650 --> 00:50:03,120
all right.

794
00:50:03,120 --> 00:50:04,360
So it's 100 iterations.

795
00:50:04,360 --> 00:50:07,880
So it's 200 iterations.

796
00:50:07,880 --> 00:50:10,100
These are not crazy numbers.

797
00:50:10,100 --> 00:50:12,570
These are quite feasible
numbers.

798
00:50:12,570 --> 00:50:16,450
And so if you're willing to
do a lot of computation--

799
00:50:16,450 --> 00:50:18,160
which is what you expect,
as you get close

800
00:50:18,160 --> 00:50:19,480
to capacity, right--

801
00:50:19,480 --> 00:50:23,520
you can get as close to capacity
as you like, at least

802
00:50:23,520 --> 00:50:26,270
on this channel.

803
00:50:26,270 --> 00:50:30,010
OK, isn't that great?

804
00:50:30,010 --> 00:50:35,010
It's an easy channel, I grant
you, but everything here is

805
00:50:35,010 --> 00:50:38,000
pretty simple.

806
00:50:38,000 --> 00:50:40,890
All these sum product
updates--

807
00:50:40,890 --> 00:50:45,750
for here, it's just a matter
of basically

808
00:50:45,750 --> 00:50:46,900
propagating erasures.

809
00:50:46,900 --> 00:50:49,870
You just take the
known variables.

810
00:50:49,870 --> 00:50:53,130
You keep computing as many
as you can of them.

811
00:50:53,130 --> 00:50:57,110
Basically, every time an edge
becomes known, you only have

812
00:50:57,110 --> 00:50:59,910
to visit each edge
once, actually.

813
00:50:59,910 --> 00:51:02,220
The first time it becomes known
is the only time you

814
00:51:02,220 --> 00:51:02,850
have to visit it.

815
00:51:02,850 --> 00:51:05,490
After that, you can just
leave it fixed.

816
00:51:05,490 --> 00:51:13,130
All right, so if this has a
linear number of edges, as it

817
00:51:13,130 --> 00:51:15,940
does, by construction, for
either the regular or

818
00:51:15,940 --> 00:51:18,580
irregular case, the complexity
is now going

819
00:51:18,580 --> 00:51:20,690
to be linear, right?

820
00:51:20,690 --> 00:51:22,190
We only have to visit
each edge once.

821
00:51:22,190 --> 00:51:27,640
There are only a number of
edges proportional to n.

822
00:51:27,640 --> 00:51:29,800
So the complexity of this whole
decoding algorithm-- all

823
00:51:29,800 --> 00:51:33,490
you do is, you fix as many edges
as you can, then you go

824
00:51:33,490 --> 00:51:36,670
over here and you try to fix as
many more edges as you can.

825
00:51:36,670 --> 00:51:39,520
You come back here, try to fix
as many more as you can.

826
00:51:39,520 --> 00:51:42,700
It will behave exactly as this
simulation shows it will

827
00:51:42,700 --> 00:51:49,220
behave, and after going back
and forth maybe 100 times--

828
00:51:49,220 --> 00:51:53,470
in more reasonable cases, it's
only 10 or 20 times, it's a

829
00:51:53,470 --> 00:51:56,970
very finite number of times--

830
00:51:56,970 --> 00:51:59,700
you'll be done.

831
00:51:59,700 --> 00:52:05,390
Another qualitative aspect of
this that you already see in

832
00:52:05,390 --> 00:52:08,150
the regular code case--

833
00:52:08,150 --> 00:52:10,760
in fact, you see it very
nicely there-- is that

834
00:52:10,760 --> 00:52:16,450
typically, very typically, you
have an initial period here

835
00:52:16,450 --> 00:52:18,680
where you make a rapid progress
because the curves

836
00:52:18,680 --> 00:52:22,740
are pretty far apart, then you
have some narrow little tunnel

837
00:52:22,740 --> 00:52:25,540
that you have to get through,
and then the

838
00:52:25,540 --> 00:52:27,010
curves widen up again.

839
00:52:27,010 --> 00:52:28,480
I've exaggerated it here.

840
00:52:32,860 --> 00:52:35,730
So OK, you're making great
progress, you're filling in,

841
00:52:35,730 --> 00:52:39,400
lots of edges become known, and
then for a while it seems

842
00:52:39,400 --> 00:52:43,020
like you're making no progress
at all, making very tiny

843
00:52:43,020 --> 00:52:46,630
progress on each iteration.

844
00:52:46,630 --> 00:52:50,730
But then, you get through
this tunnel, and boom!

845
00:52:50,730 --> 00:52:53,330
Things go very fast.

846
00:52:53,330 --> 00:52:55,920
And for this code,
it has a zero--

847
00:52:55,920 --> 00:52:59,670
the regular code has a zero
slope at this point, whereas

848
00:52:59,670 --> 00:53:03,540
this has a non-zero slope.

849
00:53:03,540 --> 00:53:05,600
So these things will go boom,
boom, boom, boom, boom as you

850
00:53:05,600 --> 00:53:08,200
go in there.
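
A sketch of the density-evolution recursion behind this picture, under the usual independence assumption. The (3, 6)-regular degrees and the erasure probability 0.42 are illustrative choices, not from the lecture; below the (3, 6) threshold of roughly 0.4294, the message erasure probability creeps through the narrow tunnel and then collapses, tornado-style.

```python
# Density evolution for a (dv, dc)-regular LDPC code on the BEC:
#   x_{t+1} = p * (1 - (1 - x_t)**(dc - 1))**(dv - 1)
# where x_t is the erasure probability of a bit-to-check message at
# iteration t and p is the channel erasure probability.

def density_evolution(p, dv=3, dc=6, iters=200):
    x = p
    history = [x]
    for _ in range(iters):
        x = p * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
        history.append(x)
    return history

good = density_evolution(0.42)   # just below threshold: slow tunnel, then boom
bad = density_evolution(0.44)    # above threshold: stalls at a fixed point
print(good[-1], bad[-1])
```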

851
00:53:08,200 --> 00:53:11,660
So these guys at Digital
Fountain, they called their

852
00:53:11,660 --> 00:53:13,990
second class of codes,
[UNINTELLIGIBLE], tornado

853
00:53:13,990 --> 00:53:15,930
codes, because they
had this effect.

854
00:53:15,930 --> 00:53:18,370
You have to struggle and
struggle, but then when you

855
00:53:18,370 --> 00:53:22,440
finally get it, there's a
tornado, a blizzard, of known

856
00:53:22,440 --> 00:53:24,661
edges, and all of a sudden, all
the edges become known.

857
00:53:27,960 --> 00:53:31,760
Oh by the way, this could
be done for packets.

858
00:53:31,760 --> 00:53:32,870
There's nothing--

859
00:53:32,870 --> 00:53:36,380
you know, this is a repetition
for a packet, and this is a

860
00:53:36,380 --> 00:53:38,800
bit-wise parity check
for a packet.

861
00:53:38,800 --> 00:53:42,200
So the same diagram works
perfectly well for packet

862
00:53:42,200 --> 00:53:43,400
transmission.

863
00:53:43,400 --> 00:53:44,390
That's the way they use it.

864
00:53:44,390 --> 00:53:44,858
Yeah?

865
00:53:44,858 --> 00:53:46,108
AUDIENCE: [UNINTELLIGIBLE]

866
00:53:49,070 --> 00:53:49,760
PROFESSOR: Yeah.

867
00:53:49,760 --> 00:53:50,060
Right.

868
00:53:50,060 --> 00:53:53,670
So this chart makes
it very clear.

869
00:53:53,670 --> 00:53:55,740
If you're going to get this
tornado effect, it's because

870
00:53:55,740 --> 00:53:57,490
you have some gap in here.

871
00:53:57,490 --> 00:53:59,480
The bigger the gap, the further
away you are from

872
00:53:59,480 --> 00:54:01,630
capacity, quite quantitatively.

873
00:54:06,886 --> 00:54:08,782
So I just--

874
00:54:08,782 --> 00:54:11,380
this is the first year I've been
able to get this far in

875
00:54:11,380 --> 00:54:13,150
the course, and I think
this is very much

876
00:54:13,150 --> 00:54:17,530
worth presenting because--

877
00:54:17,530 --> 00:54:18,650
look at what's happened here.

878
00:54:18,650 --> 00:54:23,230
At least for one channel, after
50 years of work in

879
00:54:23,230 --> 00:54:29,040
trying to get to Shannon's
channel capacity, around 1995

880
00:54:29,040 --> 00:54:32,960
or so, people finally figured
out a way of constructing a

881
00:54:32,960 --> 00:54:35,990
code and a decoding algorithm
that in fact has linear

882
00:54:35,990 --> 00:54:40,630
complexity, and can get as close
to channel capacity as

883
00:54:40,630 --> 00:54:42,890
you like in a very
feasible way, at

884
00:54:42,890 --> 00:54:46,300
least for this channel.

885
00:54:46,300 --> 00:54:50,160
So that's really where we want
to end the story in this

886
00:54:50,160 --> 00:54:52,480
class, because the whole class
has been about getting to

887
00:54:52,480 --> 00:54:53,290
channel capacity.

888
00:54:53,290 --> 00:54:56,350
Well, what about
other channels?

889
00:54:56,350 --> 00:55:00,230
What about channels
with errors here?

890
00:55:00,230 --> 00:55:13,850
So let's go to the symmetric
input binary

891
00:55:13,850 --> 00:55:20,700
channel, which I--

892
00:55:23,490 --> 00:55:27,760
symmetric, sorry-- symmetric
binary input channel.

893
00:55:27,760 --> 00:55:30,160
This is not standardized.

894
00:55:30,160 --> 00:55:35,810
The problem is, what you really
want to say is the

895
00:55:35,810 --> 00:55:38,620
binary symmetric channel, except
that term is already

896
00:55:38,620 --> 00:55:41,940
taken, so you've got to
say something else.

897
00:55:41,940 --> 00:55:44,510
I say symmetric binary
input channel.

898
00:55:44,510 --> 00:55:45,775
You'll see other things
in the literature.

899
00:55:48,610 --> 00:55:54,840
This channel has 2 inputs: 0
and 1, and it has as many

900
00:55:54,840 --> 00:55:55,950
outputs as you like.

901
00:55:55,950 --> 00:55:59,290
It might have an
erasure output.

902
00:55:59,290 --> 00:56:02,120
And the key thing about the
erasure output is that the

903
00:56:02,120 --> 00:56:04,730
probability of getting there
from either 0 or 1 is the

904
00:56:04,730 --> 00:56:07,790
same, call it p again.

905
00:56:07,790 --> 00:56:11,610
And so the a posteriori
probability, let's write the

906
00:56:11,610 --> 00:56:14,830
APPs by each of these.

907
00:56:14,830 --> 00:56:16,950
The erasure output is always
going to be a state of

908
00:56:16,950 --> 00:56:19,510
complete ignorance,
you don't know.

909
00:56:19,510 --> 00:56:22,010
So there might be one output
like that, and then there will

910
00:56:22,010 --> 00:56:28,150
be other outputs here
that occur in pairs.

911
00:56:28,150 --> 00:56:31,650
And the pairs are always going
to have the character that

912
00:56:31,650 --> 00:56:35,730
their APP is going
to be 1 minus--

913
00:56:35,730 --> 00:56:37,280
I've used p excessively here.

914
00:56:37,280 --> 00:56:41,080
Let me take it off of here
and use it here--

915
00:56:41,080 --> 00:56:44,330
for a typical other pair, you're
going to have (1 minus

916
00:56:44,330 --> 00:56:48,570
p, p), or (p, 1 minus p).

917
00:56:48,570 --> 00:56:50,680
In other words, just looking
at these 2 outputs, it's a

918
00:56:50,680 --> 00:56:53,380
binary symmetric channel.

919
00:56:53,380 --> 00:56:56,080
A probability p of
crossover, and 1

920
00:56:56,080 --> 00:56:59,890
minus p of being correct.

921
00:56:59,890 --> 00:57:03,140
And we may have pairs that are
pretty unreliable where p is

922
00:57:03,140 --> 00:57:05,580
close to 1/2, and we
may have pairs that

923
00:57:05,580 --> 00:57:07,460
are extremely reliable.

924
00:57:07,460 --> 00:57:14,110
So this is (1 minus p prime, p
prime), where p prime might be

925
00:57:14,110 --> 00:57:17,550
very close to 0.

926
00:57:17,550 --> 00:57:20,820
But the point is, the outputs
always occur in these pairs.

927
00:57:20,820 --> 00:57:25,750
The output space can be
partitioned into pairs such

928
00:57:25,750 --> 00:57:28,780
that, for each pair, you have a
binary symmetric channel, or

929
00:57:28,780 --> 00:57:32,580
you might have this singleton,
which is an erasure.

930
00:57:32,580 --> 00:57:37,030
And this is, of course, what we
have for the binary input

931
00:57:37,030 --> 00:57:39,170
additive white Gaussian
noise channel.

932
00:57:39,170 --> 00:57:45,440
We have 2 inputs, and now we
have an output which is the

933
00:57:45,440 --> 00:57:49,760
complete real line, which has
a distribution like this.

934
00:57:49,760 --> 00:57:53,555
But in this case, 0
is the erasure.

935
00:57:53,555 --> 00:57:57,080
If we get a 0, then the
probability of the APP message

936
00:57:57,080 --> 00:57:59,140
is (1/2,1/2).

937
00:57:59,140 --> 00:58:02,620
And the pairs are
plus or minus y.

938
00:58:02,620 --> 00:58:11,090
If we get to see y, then the
probability of y given 0, or

939
00:58:11,090 --> 00:58:14,920
given one, that's the same pair
as the probability of

940
00:58:14,920 --> 00:58:16,910
minus y given--

941
00:58:16,910 --> 00:58:20,740
this is, of course, minus 1,
plus 1 for my 2 possible

942
00:58:20,740 --> 00:58:22,950
transmissions here.

943
00:58:22,950 --> 00:58:25,800
Point is, the binary input additive
white Gaussian noise channel

944
00:58:25,800 --> 00:58:27,120
is in this class.

945
00:58:27,120 --> 00:58:29,517
It has a continuous output
rather than a discrete output.
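
For concreteness, a small sketch of the APP pair for the binary-input AWGN channel just described, assuming equiprobable inputs mapped to plus and minus 1 and noise variance sigma squared (the signaling convention is an assumption here). The pair depends on y only through the log-likelihood ratio 2y over sigma squared, and y = 0 behaves exactly like an erasure.

```python
import math

# APP pair (P(0 sent | y), P(1 sent | y)) for a binary-input AWGN channel
# with equiprobable inputs mapped to +1 (for 0) and -1 (for 1).
# LLR(y) = 2*y / sigma^2; y = 0 gives the erasure message (1/2, 1/2).

def app_pair(y, sigma2=1.0):
    llr = 2.0 * y / sigma2
    p0 = 1.0 / (1.0 + math.exp(-llr))
    return (p0, 1.0 - p0)

print(app_pair(0.0))  # -> (0.5, 0.5), a state of complete ignorance
# app_pair(y) and app_pair(-y) are the same pair with components swapped.
```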

946
00:58:32,200 --> 00:58:34,630
But there's a key symmetry
property here.

947
00:58:34,630 --> 00:58:39,720
Basically, if you exchange
0 for 1, nothing changes.

948
00:58:39,720 --> 00:58:42,070
All right, so the symmetry
between 0 and 1.

949
00:58:42,070 --> 00:58:44,830
That's why it's called
a symmetric channel.

950
00:58:44,830 --> 00:58:49,390
That means you can easily prove
that the capacity

951
00:58:49,390 --> 00:58:53,630
achieving input distribution is
always (1/2,1/2), for any

952
00:58:53,630 --> 00:58:54,450
such channel.

953
00:58:54,450 --> 00:58:56,880
If you've taken information
theory, you've seen this

954
00:58:56,880 --> 00:58:59,050
demonstrated.

955
00:58:59,050 --> 00:59:04,370
And this has the important
implication that you can use

956
00:59:04,370 --> 00:59:08,320
linear codes on any symmetric
binary input channel without

957
00:59:08,320 --> 00:59:09,865
loss of channel capacity.

958
00:59:13,200 --> 00:59:15,020
Linear codes achieve capacity.

959
00:59:19,450 --> 00:59:22,640
OK, whereas, of course, if this
weren't (1/2,1/2), then

960
00:59:22,640 --> 00:59:24,480
linear codes couldn't possibly
achieve capacity.

961
00:59:32,190 --> 00:59:33,960
Suppose you have
such a channel.

962
00:59:33,960 --> 00:59:38,450
What are the sum product
updates?

963
00:59:38,450 --> 00:59:43,860
The sum product updates become
more complicated.

964
00:59:43,860 --> 00:59:46,400
They're really not hard
for the equality sign.

965
00:59:46,400 --> 00:59:50,560
You remember for a repetition
node, the sum product update

966
00:59:50,560 --> 00:59:54,610
is just the product of basically
the APPs coming in

967
00:59:54,610 --> 00:59:56,670
or the APPs going out.

968
00:59:56,670 --> 00:59:58,920
So all we've got to do
is take the product.

969
00:59:58,920 --> 01:00:02,410
It'll turn out the messages in
this case are always of the

970
01:00:02,410 --> 01:00:06,500
form (p, 1 minus p)--

971
01:00:06,500 --> 01:00:09,420
of course, because they're
binary, and so

972
01:00:09,420 --> 01:00:11,040
has to be like this--

973
01:00:11,040 --> 01:00:13,830
so we really just need
a single parameter p.

974
01:00:13,830 --> 01:00:18,120
We multiply all the p's,
normalize correctly, and

975
01:00:18,120 --> 01:00:19,660
that'll be the output.
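
A sketch of that repetition-node update: componentwise product of the incoming (P(0), P(1)) pairs, then normalize. The message values in the example are made up for illustration.

```python
# Sum-product update at a repetition (equality) node: the outgoing APP
# vector is the componentwise product of the incoming ones, renormalized.

def repetition_update(messages):
    p0, p1 = 1.0, 1.0
    for a, b in messages:        # each message is a (P(0), P(1)) pair
        p0 *= a
        p1 *= b
    total = p0 + p1
    return (p0 / total, p1 / total)

# Two fairly confident "it's a 0" messages reinforce each other:
print(repetition_update([(0.9, 0.1), (0.8, 0.2)]))
```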

976
01:00:22,410 --> 01:00:26,270
For the update here, I'm sorry
I don't have time to talk

977
01:00:26,270 --> 01:00:32,990
about it in class, but there's
a clever little procedure

978
01:00:32,990 --> 01:00:35,790
which basically says take
the Hadamard Transform

979
01:00:35,790 --> 01:00:37,330
of (p, 1 minus p).

980
01:00:37,330 --> 01:00:40,750
The Hadamard Transform in
general says, convert the pair (a, b) to

981
01:00:40,750 --> 01:00:44,160
the pair (a plus b,
a minus b).

982
01:00:44,160 --> 01:00:50,110
So in this case, we convert it
to a plus b is always 1, and a

983
01:00:50,110 --> 01:00:56,990
minus b is, in this
case, 2p minus 1.

984
01:00:56,990 --> 01:00:59,440
Works out better, turns
out this is actually

985
01:00:59,440 --> 01:01:02,400
a likelihood ratio.

986
01:01:02,400 --> 01:01:04,970
Take the Hadamard Transform,
then you can use the same

987
01:01:04,970 --> 01:01:07,250
product update rule as
you used up here.

988
01:01:10,220 --> 01:01:19,860
So do the repetition node
updates, which is easy--

989
01:01:19,860 --> 01:01:23,980
so it says just multiply all the
inputs component-wise in

990
01:01:23,980 --> 01:01:27,880
this vector, and then take the
Hadamard Transform again to

991
01:01:27,880 --> 01:01:33,400
get your time domain
or primal domain,

992
01:01:33,400 --> 01:01:34,820
rather than dual domain.

993
01:01:34,820 --> 01:01:36,920
So you work in the dual
domain, rather

994
01:01:36,920 --> 01:01:38,560
than the primal domain.

995
01:01:38,560 --> 01:01:40,630
Again, I'm sorry.

996
01:01:40,630 --> 01:01:42,390
You've got a homework problem on
it; after you've done the

997
01:01:42,390 --> 01:01:45,590
homework problem, you'll
understand this.

998
01:01:45,590 --> 01:01:51,410
And this turns out to
involve hyperbolic

999
01:01:51,410 --> 01:01:53,860
tangents to do these.

1000
01:01:53,860 --> 01:01:57,770
These Hadamard Transforms turn
out to be taking hyperbolic

1001
01:01:57,770 --> 01:02:01,010
tangents, and this is called the
hyperbolic tangent rule,

1002
01:02:01,010 --> 01:02:02,630
the tanh rule.

1003
01:02:02,630 --> 01:02:05,790
So there's a simple way to do
updates in general for any of

1004
01:02:05,790 --> 01:02:07,040
these channels.
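
A sketch of the tanh rule at a parity-check node, following the Hadamard-transform recipe above: transform each pair (p, 1 minus p) to (1, 1 minus 2p), multiply the second components, and transform back. Since 1 minus 2p equals tanh(L/2) for the corresponding log likelihood ratio L, the same update in the LLR domain multiplies hyperbolic tangents. The numeric inputs below are illustrative.

```python
import math

# Check-node update in the dual (Hadamard-transform) domain.
# p is the probability a bit is 1; 1 - 2p = tanh(L/2) with L = ln(P(0)/P(1)).

def check_update_p(ps):
    prod = 1.0
    for p in ps:
        prod *= (1.0 - 2.0 * p)      # product of dual-domain components
    return (1.0 - prod) / 2.0        # inverse transform back to a probability

def check_update_llr(llrs):
    t = 1.0
    for L in llrs:
        t *= math.tanh(L / 2.0)      # the tanh rule
    return 2.0 * math.atanh(t)

# The two forms agree once LLRs are converted to probabilities:
llrs = [1.0, -0.5, 2.0]
ps = [1.0 / (1.0 + math.exp(L)) for L in llrs]
print(check_update_p(ps), 1.0 / (1.0 + math.exp(check_update_llr(llrs))))
```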

1005
01:02:09,740 --> 01:02:13,400
Now, you can do the
same kind of

1006
01:02:13,400 --> 01:02:17,970
analysis, but what's different?

1007
01:02:17,970 --> 01:02:22,280
For the erasure channel, we only
had 2 types of messages,

1008
01:02:22,280 --> 01:02:25,620
known or erased, and all we
really had to do is keep track

1009
01:02:25,620 --> 01:02:28,980
of what's the probability of the
erasure type of message,

1010
01:02:28,980 --> 01:02:32,690
or 1 minus this probability,
it doesn't matter.

1011
01:02:32,690 --> 01:02:34,670
So that's why I said it
was one-dimensional.

1012
01:02:37,430 --> 01:02:45,310
For the symmetric binary input
channel, in general, you can

1013
01:02:45,310 --> 01:02:47,880
have any APP vector here.

1014
01:02:47,880 --> 01:02:49,840
This is a single parameter
vector.

1015
01:02:49,840 --> 01:02:54,780
It's parameterized by p, or by
the likelihood ratio, or by

1016
01:02:54,780 --> 01:02:56,690
the log likelihood ratio.

1017
01:02:56,690 --> 01:02:58,340
There are various ways
to parameterize it.

1018
01:02:58,340 --> 01:03:01,750
But in any case, a single number
tells you what the APP

1019
01:03:01,750 --> 01:03:04,220
message is.

1020
01:03:04,220 --> 01:03:08,676
And so at this point-- or I
guess, better looking at it in

1021
01:03:08,676 --> 01:03:10,310
the computation tree--

1022
01:03:10,310 --> 01:03:12,790
at each point, instead of having
a single number, we

1023
01:03:12,790 --> 01:03:16,600
have a probability distribution
on p.

1024
01:03:16,600 --> 01:03:22,840
So we get some probability
distribution on p, pp of p,

1025
01:03:22,840 --> 01:03:27,640
that characterizes where
you are at this time.

1026
01:03:27,640 --> 01:03:31,490
Coming off the channel,
initially, the probability

1027
01:03:31,490 --> 01:03:35,690
distribution on p might be equal
to y, I think it is,

1028
01:03:35,690 --> 01:03:39,370
actually, or e to the minus y,
and you get some probability

1029
01:03:39,370 --> 01:03:41,700
distribution on what p is.

1030
01:03:45,070 --> 01:03:47,720
By the way, again because of
symmetry, you can always

1031
01:03:47,720 --> 01:03:51,760
assume that the all-zero vector
was sent in your code.

1032
01:03:51,760 --> 01:03:55,770
It doesn't matter which of your
code words is sent, since

1033
01:03:55,770 --> 01:03:57,430
everything is symmetrical.

1034
01:03:57,430 --> 01:04:00,120
So you can do all your analysis
assuming the all-zero

1035
01:04:00,120 --> 01:04:01,620
code word was sent.

1036
01:04:01,620 --> 01:04:04,060
This simplifies things
a lot, too.

1037
01:04:04,060 --> 01:04:07,840
p then becomes the probability
which--

1038
01:04:07,840 --> 01:04:10,270
well, I guess I've
got it backwards.

1039
01:04:10,270 --> 01:04:15,870
Should be 1 minus p, because
p then becomes the

1040
01:04:15,870 --> 01:04:18,398
probability.

1041
01:04:18,398 --> 01:04:22,140
If the assumed probability of
the input is a 1, in other

1042
01:04:22,140 --> 01:04:24,640
words, the probability
that your current

1043
01:04:24,640 --> 01:04:27,470
guess would be wrong--

1044
01:04:27,470 --> 01:04:29,000
I'm not saying that well.

1045
01:04:29,000 --> 01:04:31,640
Anyway, you get some
distribution of p.

1046
01:04:31,640 --> 01:04:34,540
Let me just draw it like that.

1047
01:04:34,540 --> 01:04:37,120
So here's pp of p.

1048
01:04:37,120 --> 01:04:40,210
There's probability
distribution.

1049
01:04:40,210 --> 01:04:42,560
And again, we'll draw it
going from 1 to 0.

1050
01:04:45,860 --> 01:04:47,180
So that doesn't go out here.

1051
01:04:50,550 --> 01:04:56,700
OK, with more effort, you can
again see what the effect of

1052
01:04:56,700 --> 01:04:58,480
the update rule is
going to be.

1053
01:04:58,480 --> 01:05:02,400
For each iteration, you have a
certain input distribution on

1054
01:05:02,400 --> 01:05:03,490
all these lines.

1055
01:05:03,490 --> 01:05:06,190
Again, under the independence
assumption, you get

1056
01:05:06,190 --> 01:05:08,580
independently--

1057
01:05:08,580 --> 01:05:12,640
you get a distribution for the
APP parameter p on each of

1058
01:05:12,640 --> 01:05:13,980
these lines.

1059
01:05:13,980 --> 01:05:14,780
That leads--

1060
01:05:14,780 --> 01:05:17,380
you can then calculate what the
distribution-- or simulate

1061
01:05:17,380 --> 01:05:21,380
what it is on the output line,
just by seeing what's the

1062
01:05:21,380 --> 01:05:24,480
effect of applying the
sum product rule.

1063
01:05:24,480 --> 01:05:27,300
It's a much more elaborate
calculation, but you can do

1064
01:05:27,300 --> 01:05:30,870
it, or you can do it up to
some degree of precision.

1065
01:05:30,870 --> 01:05:35,950
This you can't do exactly, but
you can do it to fourteen bits

1066
01:05:35,950 --> 01:05:38,800
of precision if you like.

1067
01:05:38,800 --> 01:05:44,290
And so again, you can work
through something that amounts

1068
01:05:44,290 --> 01:05:51,580
to plotting the progress of the
iteration through here, up

1069
01:05:51,580 --> 01:05:54,860
to any degree of precision
you want.

1070
01:05:54,860 --> 01:05:59,980
So again, you can determine
whether it succeeds or fails,

1071
01:05:59,980 --> 01:06:02,900
again, for regular or irregular
low-density parity

1072
01:06:02,900 --> 01:06:05,590
check codes.

1073
01:06:05,590 --> 01:06:07,340
In general, it's better
to make it irregular.

1074
01:06:07,340 --> 01:06:11,580
You could make it as irregular
as you like.

1075
01:06:11,580 --> 01:06:14,380
And so you can see that this
could involve a lot of

1076
01:06:14,380 --> 01:06:21,000
computer time to optimize
everything, but at the end of

1077
01:06:21,000 --> 01:06:27,080
the day, it's basically a
similar kind of hill climbing,

1078
01:06:27,080 --> 01:06:31,950
curve fitting exercise, where
ultimately on any of these

1079
01:06:31,950 --> 01:06:39,180
binary input symmetric channels,
you can get as close

1080
01:06:39,180 --> 01:06:41,350
as you want to capacity.

1081
01:06:41,350 --> 01:06:44,090
In the very first lecture, I
showed you what Sae-Young

1082
01:06:44,090 --> 01:06:46,460
Chung achieved in his thesis.

1083
01:06:46,460 --> 01:06:49,740
He took the binary input
additive white

1084
01:06:49,740 --> 01:06:51,420
Gaussian noise channel.

1085
01:06:51,420 --> 01:06:54,980
This was under the assumption of

1086
01:06:54,980 --> 01:06:56,990
asymptotically long random codes.

1087
01:06:56,990 --> 01:07:02,290
He got within 0.0045 dB
of channel capacity.

1088
01:07:02,290 --> 01:07:05,990
And then for a more reasonable
number, like a block length of

1089
01:07:05,990 --> 01:07:10,970
10 to the seventh, he
got within 0.040

1090
01:07:10,970 --> 01:07:13,500
dB of channel capacity.

1091
01:07:13,500 --> 01:07:15,200
Now, that's still a
longer code than

1092
01:07:15,200 --> 01:07:16,110
anybody's going to use.

1093
01:07:16,110 --> 01:07:20,730
It's a little bit of a stunt,
but I think his work convinced

1094
01:07:20,730 --> 01:07:25,070
everybody that we finally had
gotten to channel capacity.

1095
01:07:25,070 --> 01:07:27,300
OK, the Eta Kappa Nu
person is here.

1096
01:07:27,300 --> 01:07:30,910
Please help her out, and
we'll see you Monday.