It was quick. Excellent. Yes, I'm very happy to be here.

So Lydia, yesterday at the dinner, asked me what I thought when I got the invitation to speak here. (I'm hearing myself.) Anyway, the answer is that I'm quite happy to do these things and to speak to people outside of the research community about these large language models, because I feel a bit of a responsibility. They are the product of exactly my research field, and there's a lot of justified confusion out there. So it is good to get into a conversation with people like you, who might be shaping how these things are actually going to be used, if at all.

I have ten minutes, so this is going to be very high level, obviously. What I thought I'd do is, in the first part, talk a bit about the history, about how we got here. In the field we had a little bit of a heads-up; we had seen this coming for a couple of years, but not much more than that, so a lot of it was surprising for us as well. And in the second part, I will talk a bit about our current understanding of LLMs and maybe a positive path forward from there.

Okay, so here are the past three and a half decades. The history of the field is longer, but this happens to be the time that I was involved in it, first as a student and then as a researcher in various capacities. What's interesting about this is that I have already witnessed two paradigm shifts, two very different ways of doing things, which brought us here.

In the 1990s, this was very much about symbolic methods and knowledge representation. To process language with computers, we would go to our linguist friends, get grammars from them, and then write programs that process those grammars. These are programs that are very much recognizable to you as well, because they are fully thought through, hopefully at least, ideally at least. Around that time, in the late 1990s, I trained my first neural network; I took a class on neural networks at the University of Bonn, where I studied computer science at the time. I also trained my first language model in the late 1990s, I think during an exchange semester in Edinburgh. So these are not new techniques; they are fairly old techniques, but they were very niche at the time, with very specialized uses.
I would love to be able to tell you that I stuck with these methods, like the fathers of deep learning, and reaped the benefit twenty years later. But I did not. I worked with the mainstream methods of the field, which were symbolic methods at first. During the 2000s we started to use statistical methods, machine learning methods that basically learn parts of the processing module. But the point is that there were still symbolic knowledge representations. The way you got these representations, which helped you to process and understand language, was machine-learned, but the representations themselves were designed by humans to be clever.

I myself got an induction into the latest paradigm in 2014, when I did a sabbatical at Microsoft Research in Seattle. And it was crazy. There was really a buzz in the air, and people were super stoked by the first neural methods that started to work. At that time, that was word embeddings, a particular way of representing the semantics, the meaning, of words. What was new about it was that the representations weren't designed; the representations were machine-learned as well. The machine basically came up with the representation that was best for the task. It was a very simple task, but it turned out to be very general, and that set the scene for where we are now. So the first ingredient that changed many things was representation learning: the representations are built by the machine itself. And with that come end-to-end systems. Instead of building modular computer systems, you really train for the task and let the machine do all the intermediate steps itself.

And then in 2018, the BERT paper came out. Just to get an idea of how many people know something about this: does anyone know this paper? Yeah, a good few people know it. Okay, so this was the first paper that really showed that transformers would work. The transformers paper came before that, but this was the first paper that applied it and got fantastic results. The paper came out in late 2018, and I was at a conference a couple of weeks later, and people were kind of stunned. People were giving their presentations as they had planned them.
They were showing their results, and then they said: okay, when we wrote this paper, these were the state-of-the-art results; we tried this new BERT model, and it is ten percent better, it blows our model out of the water. So that really was a moment. And by then the race was on to do something with this.

For various reasons, as we all know, OpenAI was the team that produced the first model, or product maybe, that caught people's attention outside of the field. Many other people tried as well; it's a little bit of a puzzle why Google didn't get there. But anyway, that's where we are now, and on the left-hand side I have subsumed this as generalist models. The really new thing is that we now have models that aren't even trained for one particular task. Well, they are trained for one particular task, namely next-word prediction, but it turns out that this is so powerful that you can turn them into generalist models.

This really is a new situation. There has been a paradigm shift, and there has been a kind of: okay, what do we do now? Where do we go with this field? And finally, how can we build applications with this? How can we make it useful for people?

Okay, so this is my part one. I'm not really sure what the moral of it is, because, unfortunately, these methods are a little bit stupid. It is a little bit insulting, actually, that these very simple methods, when scaled up enormously, work that well. But that's where we are, and now we need to do something with this situation.

In my research group, we work a lot on language use. We think about the pragmatics of language, about the subtle cues that change and contextualize meaning. And we have a lot of projects now trying to understand to what degree LLMs can model this. I want to give a shout-out to one project we're doing: we have a benchmarking tool for LLMs that tests pragmatic abilities by letting them play dialogue games, conversational games, against themselves. But that's not what I want to talk about in the remaining however many minutes. We are also thinking about the role that LLMs can and cannot play in technology, and these are the two papers that I want to very briefly touch on in the remaining slides.
Okay, so here are three observations that we can maybe discuss later and do something with.

The first one is that LLMs have decoupled fluency from expertise. I guess you all know this, but it is surprising how many people in the general population still have not fully understood it. It used to be the case that if someone produced flawless text on a subject, they were also an expert in the subject matter. This is now very much decoupled, for technical reasons: the training objective of these models rewards them for producing the right words and the right phrases, but it has no access to the underlying reasons for why you would want to produce particular words. So it gets the tone very right for almost anything you ask it to do, but there are no guarantees on whether it will get the content right. And that is a bit of a problem, perhaps.

Second point: computers cannot assert. Even if the accuracy were a lot higher and hallucinations were very rare, they would still be fundamentally different from human speakers, because they don't care. There's no one there who could care about what is said. My conclusion from this, at least, is that humans need to retain semantic control over the output. Basically, you need to be able to understand whatever you're producing with LLMs. That restricts the use cases quite a bit, I think, but I think it's non-negotiable with these technologies.

And the third point is that understanding LLMs as language-controllable function approximators yields an understanding with which we can do something. The idea here is that instead of writing the code for a function yourself, you search for this function in the latent space of the LLM, and then you treat the LLM as if it had implemented this function. But it is only approximating this function, and interesting challenges come from that.
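To make that framing a bit more tangible, here is a minimal sketch of what treating an LLM as an approximate, validated function can look like in code; the `complete()` backend, the prompt wording, and the date-extraction task are illustrative assumptions, not something described in the talk.

```python
import json
import re

def complete(prompt: str) -> str:
    """Placeholder for whatever LLM backend is in use (local model, API, ...)."""
    raise NotImplementedError

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def extract_meeting_date(text: str) -> str | None:
    """Treat the LLM as an approximate function from text to an ISO date.

    The prompt is how we 'search' for the function in the model's latent
    space; the checks afterwards are how we keep semantic control, because
    the model only approximates the function we asked for.
    """
    prompt = (
        "Extract the meeting date from the following text. "
        'Answer only with a JSON object of the form {"date": "YYYY-MM-DD"}.\n\n'
        + text
    )
    raw = complete(prompt)
    try:
        candidate = json.loads(raw).get("date")
    except (json.JSONDecodeError, AttributeError):
        return None          # output was not even well-formed JSON
    if isinstance(candidate, str) and DATE_RE.match(candidate):
        return candidate     # plausible answer; the caller still decides what to trust
    return None              # approximation failed; fall back to asking the user
```

The shape is the point: prompt in, text out, and an explicit validation step in between, because nothing guarantees that the approximated function behaves like the one you had in mind.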
Okay, I'm already over time. I'm going to skip over a lot, but I want to end on what I think is a positive way forward for using AI technologies, and maybe we can discuss this in the panel.

So, in a positive future, what has happened is that users and buyers of this technology have developed a good mental model of it and have avoided all the anthropomorphization that is out there. These things are not interns, they're not coworkers, they're not friends. They're programs. They're programs that have to work and can be evaluated as such, and if they don't work, you don't use them or you fix them. The onus here is, I guess, on us as educators, really driving home the message that this is what these things are.

De-skilling has been avoided. There is a danger, when you naively build these techniques into products, that you de-skill people, because they rely on these products too much and forget how to do the tasks themselves. To avoid this, some of the work is on you: when you implement these systems, you can make engineering design decisions that avoid it. Some of it is on us, on the research side; we need to do more research on interpretability. And some of it is on the regulation side.

Another point: responsibility offloading has been avoided. The tools are designed in such a way that people actually accept the fact that they have to keep semantic control. Again, there's a design component to this. Maybe you have to design friction into the tools. For example, before sending off the email that you had the LLM formulate for you, get some proof from the user that they have actually read what has been produced. Something like that. It might be a naive idea, but it is for you to think about.
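One possible reading of that idea, as a small sketch: the specific rule here (a minimum review time scaled by the length of the draft) is just an assumed example of designed-in friction, not a recommendation from the talk.

```python
import time

WORDS_PER_SECOND = 5  # assumed upper bound on honest reading speed

class DraftReviewGate:
    """Refuse to send an LLM-drafted text until the user has plausibly read it."""

    def __init__(self, draft: str):
        self.draft = draft
        self.shown_at = time.monotonic()  # when the draft was put on screen

    def may_send(self) -> bool:
        words = len(self.draft.split())
        min_review_seconds = words / WORDS_PER_SECOND
        return time.monotonic() - self.shown_at >= min_review_seconds

# Usage: create the gate when the draft is shown, and only enable the "send"
# action once may_send() returns True (or after an explicit confirmation step,
# a scroll-to-end check, or whatever friction fits the application).
```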
Okay, what else do I have? Cost externalization has been reverted. These things need to be as expensive as they really are. We know this from many areas; it holds for traffic, it holds for manufacturing, and it holds for these things as well. They have to carry their real costs, which might drive some use cases out of being economically viable. But then so be it.

And the last point is maybe something that interests you as well: some form of cultural sustainability has to be achieved. Maybe we can come up with models where one part is licensing training data and paying the creators of the data that these models were trained on. But there might also be room for more creative models, where each time something is generated that can very clearly be traced back to source material, the creator of that source material is paid in some way. This is obviously unlike the real world: normally, if a teacher teaches you something, it's not the author of the textbook that the teacher learned it from who gets the money. But these are new things, so new models should be tried out.

Okay, with this, I think we can go over to the panel discussion. Thank you.

Thank you so much. I am going to leave it on this. Yes, coming. We are going to have a bit of a discussion here, and then also have time for audience questions. My first question for both of you: a friend of mine said a while ago that AI is always that which we don't quite understand right now, and what was AI ten years ago we today maybe no longer consider AI. I would love to hear from you: what do you consider to be AI today, and what do you maybe not consider to be AI? Anyone? Why don't you start?

Yeah, so I got the question list in advance, so I had time to think about this a little bit. I know a variant of this; it's called the AI paradox, I guess, and that is more about goalpost shifting: the complaint there is that as soon as something starts working, people say it is not real intelligence. I have little problem with this, because I don't think human value derives from intelligence only, so I don't feel threatened by programs doing tasks that appear intelligent. In that sense, it doesn't quite work for me. And in a more technical sense, AI is a programming technique for me, so what has been programmed in this particular way stays that way. On the contrary, I'm actually still blown away. I'm old enough to remember how terrible ASR was, and I'm still blown away by how well it works these days. So in that sense, there's still magic for me. That's my reply, I guess.

All right, maybe I should introduce myself first, because I don't think we've done that yet, and maybe there are some who are online who don't know me.
I'm Eike. I've been a KDE developer for about twenty years, on many of our applications and also on the Plasma user interface. "Intelligent" is a very common word, but when I go looking for guidance on what it means to me, I end up thinking about what our mission is at KDE. Our mission is chiefly to develop end-user software. So what does intelligent mean to a user? What makes a user want to describe a computer system they interact with as intelligent? And has that recently changed? I think it has, because large language models in particular have given computer systems, in a broad way, the new ability to do what you mean rather than what you say, which is not what users are used to from a computer, outside of maybe some pattern recognition, habit learning, and making suggestions like that. But now you can talk to your home and say, I want my room to look like the Barbie movie, and it turns the lights pink. That's something it just couldn't do before, and it does this in a very generalist fashion. It's statistical inference; it doesn't necessarily model you, so it doesn't understand what your needs are individually, and that's where some of the problems lurk. But I think it has raised the bar for what users expect computers to be able to do, and that is something we definitely have to deal with. We also have to communicate very clearly when we cannot actually match these expectations, and where to draw the line.

That's a very good point, and I love your Barbie example, which brings me to my next question: where do you use AI, in all its shapes and forms, in your daily life right now? Do you turn your room Barbie pink?

I do, on occasion. And, maybe I'm weird, but I love using these models to amuse myself. I often generate a funny picture just for my own purposes, or to share with someone, or a humorous text, because it does it faster than I can, and I can give it somewhat clear requirements. You can play with the humor of it a lot. ChatGPT has such a peculiar writing style that is immediately recognizable to anyone at this point, and you can play around with that. I also, and this is interesting, use it as an accelerator when I do development work.
It turns out to be most useful when I use it for an application that I'm already an expert in, because then I can give it very specific requirements, but I can also check whether it actually did what I wanted it to do; I can tell if the code is bad or not. The usefulness tends to break down as soon as I venture into a space where I cannot verify whether what I got is actually useful or not. So I think that's an interesting observation too.

Yeah, I guess AI in your question means LLMs, because I guess we all use AI in very many instances, like when we turn on the dishwasher. This kind of goes back to your first question: there are still machine learning modules in there. But for LLMs, that's interesting. As I said, this just comes out of my field, so I feel a responsibility to try it out a little bit, but I don't use it much, and maybe that is connected to the de-skilling. I've been in this job for a while now, and many of the things I'm doing I've been doing for a very long time. I think I'm still a lot better at generating chunks of text than an LLM is, and it would slow me down if I had to correct the text that comes out of it. I have found one good use case for me, though. I like coding, but I get to do it a lot less often than you guys, I guess, with gaps of months in between. So when, after a couple of months, I get to look at code again in Emacs or wherever, I have forgotten the most embarrassing things; I have forgotten the syntax of a for loop in Python or something. And it's quite good for this. I ask models to solve my problem, I look at the code, I remember what the code looks like, I remember the syntax, and then I write it myself. I completely disregard the solutions. But as a kind of prompt, as a memory aid, I guess it works quite well. I'm more worried about what it does to people who are new at tasks, rather than to people who have been doing tasks for a very long time.

I think you're onto something there. I think it's useful as a sort of checks and balances sometimes, and as a reference.
You write the code yourself, you also have it generated, you compare the two, and you check whether you've possibly missed anything, or whether the alternative approach is different. So I think that's interesting. And the access to the broad range of data it was trained on, for knowledge-based queries, is sometimes quite interesting.

I want to respond, if I can, on the de-skilling part. I used to be very worried about de-skilling, and probably I still should be, but I had an experience that made me a little bit less afraid of it. In KDE, we have a mentor program: we have a website that lists a number of members of the community whom you can contact on a one-on-one basis if you want to ask questions or want some tutoring on how to do KDE development. There was a young person who reached out to me, and they wanted to be a KDE contributor. They had had some limited programming training, but really they were starting out with C++, and this being 2022, of course they were already using ChatGPT, because it's kind of hard to ignore if you are trying to climb a mountain like that. So we made the agreement that, yes, he could use ChatGPT for this, but we would share an account, and I would be able to read what he asks the AI, so that I could steer him clear of going in a completely wrong direction. It was pretty interesting to see him interact with the system. And of course, it was very hard for him to keep up the discipline to try to do things by himself. He would lean very heavily on code generation in the end, and it would become a cycle of: well, this doesn't work, so I go back to the model immediately and ask it to fix it up or make changes. And it falls apart pretty quickly there. You can ask it to generate something once, and that sort of works, but integrating a hundred interactions over time doesn't yield a useful program. He started recognizing at some point that he would run into a kind of wall that he just couldn't leap over anymore, because he didn't end up understanding the code file he had produced in that patchwork fashion. And he came by himself to the conclusion that whenever he does it manually and fights that three-hour battle to produce one line of code, at the end of it he actually understands what it does and why it is fashioned that way.
And after about half a year, he said: this is not mentally healthy for me, I will stop using this crutch. So I'm an optimistic person, and I think that around this technology, too, society will come to a point where the use-it-or-lose-it of it all is apparent. It's just like using the stairs instead of the elevator occasionally, or things like that. And particularly with writing, I agree with you. I even turn off the word completion on my phone keyboard, which this is a sophisticated version of, and the spell checking, because I've always felt that my orthography will just suffer if I don't learn how to spell myself. I think a lot of people come to the same conclusion, so I'm not that worried about it; I think we'll learn how to keep it under control.

Of course, in a company, I guess you would want to measure whether this is an efficient way. Letting people run into walls, and take half a year to do so, there might be more efficient ways of training them up.

Yeah, but particularly if you're an employer, I think you should also learn to recognize the value of investing in your people. Obviously training is a thing, and the absence of AI tools may constitute training that gives you longer-lasting value from a person, if you end up shaping their intellect rather than turning them into a prompt engineer, for example.

I hope so, at least. And that brings us to some of the scenarios that the world is talking about, anything from "the world is going to end" to "this is the magical thing that is going to solve all of our problems tomorrow." Let's maybe start with the positive, or more optimistic, side of that. What gets you excited and hopeful right now about current AI development?

Okay, I'll go first. For me this is a fascinating, super exciting time. Obviously, there are things possible in my field that just weren't possible five years ago, and there's a huge open vista of things to explore. So that obviously is exciting. And on the more applied side, it will also be super interesting to see how people turn this into actual products.
And what I always say is that these models, and again, this is difficult to understand if you don't know them more closely, are kind of doomed to be generalists, because on some level of description they are, after all, autocomplete on steroids. They will complete anything and everything you give them: you give them a prompt, and they complete it in some way. If that's not what you want in your application, because you only want to produce a summary or you only want to do some other specific thing, you have the hard task of designing this generality out again. But I think the way forward to actual products is to do exactly that, and it's what I'm trying to get at with my metaphor of them being function approximators. So on the applied side there is also a lot of super interesting work, especially in the user interface space. I don't think the free chatbot with a weird, quirky personality type of interface is the final form of this technology. It's just the first thing people came up with, because it's so natural and you can train this behavior in very easily. But to be really useful in standard, normal, accepted ways, you have to measure the usefulness by something other than "wow, this is a funny-sounding limerick" or "I'm impressed by how cute this sounds." Those are not good metrics for measuring actual utility in a real context. So I think there's a lot to be done there, hopefully by some of you here.

I think what excites me about it comes down a little bit to how I conceive of the role of an engineer in society, or of what we broadly do at KDE. When I was a young boy, I used to go to the Berlin Natural History Museum a lot; I requested to go there every birthday I had. The Natural History Museum has a wonderful exhibition on Stone Age civilization and how people used to live many thousands of years ago, and you get to see the tools that that society used:
stones fashioned to make fire, and so on. And it occurred to me even as a kid that somebody had to think of making that stone, and then produce it and give it to others, and then they might use it to make fire, and then they sit around that fire and, I don't know, start speaking, and boom, civilization. I think that is, in its modern incarnation, what I aspire to do: to build tools that allow other people to live their lives a little bit better, and to spread civilization and culture. And I think AI and large language models are such a heated topic because they promise to help us make better tools, for example by allowing people to accomplish sophisticated tasks without specialized training, because they maybe bridge that natural-language understanding gap, the do-what-I-mean-instead-of-what-I-say part. But at the same time they seem, to some, to encroach on that humanity; rather than help, they supplant. And I think that's where a lot of the emotion comes from. What I liked about your presentation and your approach to it all is that you try to turn that level of emotion down a bit and pare it down to: what does it actually do, what is the mechanism here, what are its limitations. For developers, I think it's very useful to think of it as a pure function that accepts fuzzy parameters and whose return value you can't always trust. And what that allows you to do is very powerful. What you can now accomplish as a developer, say building a speech system in a weekend as opposed to taking five years at Amazon with 300 people, is pretty stunning. So that's exciting.

Now, we've talked a bit about the positive and hopeful side, but there's also a lot that gives people pause. What is that for you?

Right. As I said, these are powerful tools, or at least they appear to be powerful tools; they are themselves very confident that they are powerful tools. And that gives people the temptation to build powerful applications, and to shoot themselves very powerfully in the foot with them. And there's also a lot of money in the system,
like, a lot of money. I wanted to say this earlier and didn't get to: a lot of these breakthroughs, I have to admit, are coming out of commercial labs, out of industry labs, not academic labs, simply because of the insane capital expenditure that is needed to build these things. So there's a lot of venture capital in there, and at some point some people want to see returns on it. So the temptation to build applications that are given more power than they should have, given the lack of guarantees, is real. And that is where the doom lies. It does not lie in ChatGPT becoming conscious and trying to kill us all because we ask such inane questions all the time. It's people building applications that have a degree of autonomy that is not warranted. We know that complex systems are difficult, and this can go off the rails quickly. So that would be my worry.

Yeah, so I do think that around this technology we obviously have a big problem with what you could call media literacy. As you pointed out, users need to have a good model of how these systems actually work, what their limitations are, where your responsibilities as a user lie, and perhaps also whether it's actually good for you to use them or not. And a lot of the commercial vendors in this space are not incentivized to be honest about many of these things; only to the extent that they are worried about liability will they do it. I think for us, because we have a lot of freedom from those constraints, I mean, surely we also worry about liability, but we get to worry about many more things. Broadly, KDE's mission is to figure out what it means to make socially responsible software. Licensing, for us, is an important aspect of what socially responsible means; we have very strong ideas about that, and we should also delve into what we think it means in the AI model space. But socially responsible also means sustainable, for example, or protecting privacy. And it can also mean giving users a realistic picture of how much AI is good for them, and transparency over whether it's used or enabled or not, and things like that. I think that is what I expect of ourselves, that we should pay attention to this, to avoid the doom, perhaps.
That brings us right back to KDE's role, and maybe one last question before we go to audience questions. Are there areas in KDE where you see positive potential for using generative AI, in our applications, in development, in our community?

So, yes. I have a laundry list of little wish-list items where I think even the current level of technology would enhance our applications. Maybe something that you would benefit from as an academic who has to read a lot of long PDFs: I would love our document viewer to be able to answer the question of which page of that long PDF a particular topic is being talked about on, rather than having to do a keyword search, which is very clumsy. Little things like that. But again, it's not just the features. It's also the fact that we have an opportunity to do it differently from the other players. We have a goal, for example, of protecting the user's privacy, which makes us much more interested in running these technologies locally rather than on a server, and the other players are not doing that much. So there we can offer something substantially different to the user, something I think a lot of users want, and I'm hoping we get to do that.

Maybe one thought. I think you have an opportunity, since you don't have commercial pressures so much at all, to also lead in what you are not doing. A very lazy way of integrating AI or LLMs is to have a button that calls ChatGPT. That obviously requires no thought at all and does not address any of the problems I tried to highlight. So I think you have a way of doing things more cleverly than that. And as I said, my guess would be that trying to cut down on the generality, really investigating a use case, and making sure that this one use case is actually implemented well, that's an opportunity, I think.

I agree with you. I think one of the interesting things about KDE's software and its technology stack is that we've always been very driven to create libraries and frameworks that are used all across our applications and that make them able to interact with each other to some degree. And you can query a lot of information about these applications as they're running. I think there's a lot of opportunity there to find these more surgical places where you can conceive of a semi-intelligent function that you could slot in between, and it would do something useful. Not just bringing up a text box and things like that, but something that selects usefully among the things it is exposed to, for example. And I hope we figure out, and this is a really interesting developer problem, what a good API would be for developers to run AI jobs like that, and how they can do things together. That's very interesting.
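As a purely hypothetical sketch of what such an API could feel like (none of these names exist in KDE; the local model hook and the page-selection job are assumptions for illustration):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class AIJob:
    """A single, narrowly scoped inference job: one prompt in, one result out."""
    prompt: str
    on_result: Callable[[str], None]   # called with the model's output
    on_error: Callable[[str], None]    # called if the backend fails

class LocalAIJobRunner:
    """Hypothetical runner that sends jobs to a locally hosted model,
    so no document content leaves the user's machine."""

    def __init__(self, run_model: Callable[[str], str]):
        self._run_model = run_model    # e.g. a wrapper around a local inference engine

    def submit(self, job: AIJob) -> None:
        try:
            result = self._run_model(job.prompt)
        except Exception as exc:       # sketch only; a real API would be finer-grained
            job.on_error(str(exc))
            return
        job.on_result(result)

# Example use, in the spirit of the document-viewer wish: ask which of a few
# candidate pages (found by ordinary text search) best matches the user's topic.
def ask_best_page(runner: LocalAIJobRunner, topic: str, pages: dict[int, str]) -> None:
    excerpts = "\n".join(f"Page {n}: {text[:300]}" for n, text in pages.items())
    job = AIJob(
        prompt=f"Which page number best discusses '{topic}'?\n{excerpts}\n"
               "Answer with the page number only.",
        on_result=lambda answer: print("Jump to page:", answer.strip()),
        on_error=lambda msg: print("Falling back to keyword search:", msg),
    )
    runner.submit(job)
```

The design choice worth noting is the narrowly scoped job with explicit success and failure paths, rather than a general-purpose chat box, which matches the earlier point about designing the generality out for a specific use case.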
All right, thank you so much. Then let's take some audience questions. Who has questions? All right, let's start with you.

Oh, is this working? Okay. So when it comes to large language models, and this does not happen in all fields of AI, but it does happen here, one of the most serious problems is that it needs a lot of data and a lot of computing power, and we as free software developers really don't have the resources to purchase them. Also, we probably won't want to use a service that is proprietary and hosted by others; although they might have some privacy policy, we might still not like it, and we don't want to give our data to just some random third party. What do you think can solve this problem?

Yeah, this is a real problem. These models are only as good as they are because they have ingested a lot of data. And it's not a linear curve in capabilities; you just need a lot of data. It's not the case that with half as much data the model is half as good; there is a certain amount of data you absolutely need to get the basic capabilities off the ground. And that puts training these models from scratch, for the foreseeable future, out of the hands
of organizations without access to a lot of capital. So yeah, there's a problem there. There are some organizations, for example the Allen Institute for AI has just released the new version of the OLMo model, and they are quite open about the data they're using, and they've released all the checkpoints and so on. There is of course Llama by Meta, which has a slightly weird license; it might be possible for you to use it, but then you have to live with the fact that it was paid for by horrible deeds done by Meta. So yeah, this is a problem. I don't think training will ever become a possibility for normal people or academics, for entities not otherwise funded. If that rules it out completely for you, then that is a stance as well, one that needs to be discussed. But yeah, this is a deep problem at the root of it.

So I think the provenance of the training data is a topic our community thinks about a lot, obviously, because again, licensing is a very important topic for us in free software. You could say that creative use of the copyright system is one of the cornerstones of how we operate, and therefore we feel, from our hearts I think, that if you don't show integrity about protecting the rights of authors, then you're doing it wrong from a free software perspective, since we rely on that mechanism so much ourselves. At the same time, obviously, a lot of us feel that we very intentionally release our code as open source because we want others to have access to it, so to some extent we also care about there being a public domain. Our licensing doesn't correspond to the public domain, but we're adjacent to it. So, given all of that, I think what a lot of people in the free software community are looking for is: is there anywhere credible activity to produce a clean training data set with known provenance of the data, one that you can use guilt-free? Presumably, as an academic, that's also something you would have massive interest in. Are you aware of any solid effort to curate a sufficiently large training data set that produces a model that would not give us doubt and worries?

Yeah, as I said, there's this Allen AI initiative, the OLMo initiative. As far as I'm aware, they are documenting where the data is from and trying to make sure that, at least under
generous interpretations, it is acceptable to use. Anyway, it's still contested what copyright law says about training, but they are at least transparent about the data. But this is only one part, right? Having the data is great; you have several terabytes of data, that's great. But then training a model on it, which requires several million dollars in compute, is the next step. So that might be a barrier as well. But starting from the data, people are trying to address this.

Yeah, but I think we're more fine with someone donating the compute to us. It's the data, I think, that gives us more pause. Maybe to follow up on your answer a little: you mentioned that you need a certain minimum amount of data to produce a model that is useful. But as a non-expert looking at the various releases that come out, there certainly seem to be a lot of people trying to build smaller models that maybe show much poorer performance on knowledge queries, but still have better reasoning performance than you would expect from scaling the data down. And when it comes to integrating a model into our software that can act as a function and usefully select among things it is given zero-shot in a prompt, maybe a small enough model would actually be good enough for us, because we're not looking for that text-box chat interface, and we don't aim to be a Wikipedia replacement. So how true is the notion that you, as a model designer, get to pick and choose a bit, as in: yes, your training data set is smaller, but the model architecture still makes it useful as a function approximator?

You have to be careful there. There are two different dimensions to it that are somewhat orthogonal, somewhat independent of each other. One is the size of the model in terms of parameters. You can get smaller models that are quite okay, that have capabilities, but they still need a lot of data. And it turns out that over-training, training with a lot more tokens than people initially thought would be needed, creates better models even at smaller sizes. So these dimensions don't trade off the way you might hope: you still need a lot of data, even for a model with fewer parameters.
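A rough back-of-the-envelope illustrates that point, using two commonly cited rules of thumb from the scaling-law literature (training compute of roughly 6 times parameters times tokens, and a "compute-optimal" budget of roughly 20 tokens per parameter); these are assumptions for illustration, not figures from the talk.

```python
# Rough scaling-law arithmetic with two common rules of thumb:
#   training FLOPs ~= 6 * N_parameters * N_tokens     (Kaplan et al.-style estimate)
#   "compute-optimal" tokens ~= 20 * N_parameters      (Chinchilla-style estimate)
# Real numbers vary; this only illustrates why smaller models still want lots of data.

def train_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

configs = {
    "7B, compute-optimal tokens":  (7e9,  20 * 7e9),    # ~140B tokens
    "7B, heavily over-trained":    (7e9,  2e12),        # ~2T tokens
    "70B, compute-optimal tokens": (70e9, 20 * 70e9),   # ~1.4T tokens
}

for name, (params, tokens) in configs.items():
    print(f"{name}: {tokens / 1e12:.2f}T tokens, ~{train_flops(params, tokens):.1e} FLOPs")
```

Even the small model, trained the way recent small releases typically are, goes through on the order of a trillion tokens or more, which is the "you still need a lot of data" part of the answer.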
I think we have two minutes, according to the sign.

David, thanks for doing this. I would like to draw attention back to this slide. I noticed that you talk about cultural sustainability, but you don't talk about environmental sustainability. Why is that? Is it sort of like, oh yeah, we're going to have these...

That's meant to be covered by cost externalization; that's where the environmental sustainability sits.

Right, okay. Either way, you know that all of this is wishful thinking, right? None of it is happening, nor is there any sign that it is going to happen. Doesn't that give you pause?

Well, I didn't give you my negative vision for AI, which is kind of the inverse of this. But we have to fight for it. What can we do? We can't just resign. We can identify areas where we would like to intervene.

Yes.

I'm here; that's my attempt at telling you to be aware of these issues. I talk to decision makers, I talk to politicians who have dollar signs, euro signs, in their eyes when they think about AI, and I tell them: okay, here are areas where you can really do something.

Yeah, yeah, that is what's happening, but you are kind of enabling them, right?

Yeah. I don't know...

I mean, exactly, that's the thing. So I think...

Progress at any cost?

No, no, no. What I mean is that, for example, now we're at KDE. I think the worst thing we could do is ignore the topic because it is fraught with problems. Through a combination of hard work over thirty years and historical happenstance, we're one of a half dozen ways to use a computer; there aren't that many ways to interact with a PC, and we're one of them. So I think that means we get to think about what stance we take and how to do it according to our value system, and hopefully show people a better example. So I think your point is very appropriate. We have a KDE Eco initiative; what does that mean in the AI context? And what it means to be using AI in KDE software is a very good thing for us to think about.
And we will continue (I see the sign, I know, I'm ignoring it) this conversation in the hallway, I'm sure we will. Thank you so much, everyone, and thank you for inviting us.