It was quick. Excellent. Yes, I'm very happy to be here.

So Lydia, yesterday at the dinner, asked me what I thought when I got the invitation to speak here. (I'm hearing myself.) Anyway, the answer is that I'm quite happy to do these things and to speak to people outside of the research community about these large language models, because I feel a bit of a responsibility. They are the product of exactly my research field, and there's a lot of justified confusion out there. So it is good to get into a conversation with people like you, who might be shaping how these things are actually going to be used, if at all.

I have ten minutes, so this is going to be very high level, obviously. What I thought I'd do is, in the first part, talk a bit about the history, about how we got here. In the field we had a little bit of a heads-up; we had seen this coming for a couple of years, but not much more than that, so a lot of it was surprising for us as well. And in the second part, I will talk a bit about our current understanding of LLMs and maybe a positive path forward from there.

Okay, so here are the past three and a half decades. The history of the field is longer, but this happens to be the time that I was involved in it, first as a student and then as a researcher in various capacities. What's interesting about this is that I have already witnessed two paradigm shifts, two very different ways of doing things, which brought us here.

In the 1990s, this was very much about symbolic methods and knowledge representation. To process language with computers, we would go to our linguist friends, get grammars from them, and then write programs that process those grammars. These are programs that are very much recognizable to you as well, because they are fully thought through, hopefully at least, ideally at least. Around that time, in the late 1990s, I trained my first neural network; I took a class on neural networks at the University of Bonn, where I studied computer science at the time. I also trained my first language model in the late 1990s, I think during an exchange semester in Edinburgh. So these are not new techniques; they are fairly old techniques, but they were very niche at the time, with very specialized uses.
I would love to be able to tell you that I stuck with these methods, like the fathers of deep learning, and reaped the benefit twenty years later. But I did not. I worked with the mainstream methods of the field, which were symbolic methods at first. During the 2000s we started to use statistical methods, machine learning methods that basically learn parts of the processing module. But the point is that there were still symbolic knowledge representations. The way you got these representations, which helped you to process and understand language, was machine-learned, but the representations themselves were designed by humans to be clever.

I myself got an induction into the latest paradigm in 2014, when I did a sabbatical at Microsoft Research in Seattle. And it was crazy. There was really a buzz in the air, and people were super stoked by the first neural methods that started to work. At that time, that was word embeddings, a particular way of representing the semantics, the meaning, of words. What was new about it was that the representations weren't designed; the representations were machine-learned as well. The machine basically came up with the representation that was best for the task. It was a very simple task, but it turned out to be very general, and that set the scene for where we are now. So the first ingredient that changed many things was representation learning: the representations are built by the machine itself. And with that come end-to-end systems. Instead of building modular computer systems, you really train for the task and let the machine do all the intermediate steps itself.

And then in 2018, the BERT paper came out. Just to get an idea of how many people know something about this: does anyone know this paper? Yeah, a good few people know it. Okay, so this was the first paper that really showed that transformers would work. The transformers paper came before that, but this was the first paper that applied it and got fantastic results. The paper came out in late 2018, and I was at a conference a couple of weeks later, and people were kind of stunned. People were giving their presentations as they had planned them.
They were showing their results, and then they said: okay, when we wrote this paper, these were the state-of-the-art results; we tried this new BERT model, and it is ten percent better, it blows our model out of the water. So that really was a moment. And by then the race was on to do something with this.

For various reasons, as we all know, OpenAI was the team that produced the first model, or product maybe, that caught people's attention outside of the field. Many other people tried as well; it's a little bit of a puzzle why Google didn't get there. But anyway, that's where we are now, and on the left-hand side I have subsumed this as generalist models. The really new thing is that we now have models that aren't even trained for one particular task. Well, they are trained for one particular task, namely next-word prediction, but it turns out that this is so powerful that you can turn them into generalist models.

This really is a new situation. There has been a paradigm shift, and there has been a kind of: okay, what do we do now? Where do we go with this field? And finally, how can we build applications with this? How can we make it useful for people?

Okay, so this is my part one. I'm not really sure what the moral of it is, because, unfortunately, these methods are a little bit stupid. It is a little bit insulting, actually, that these very simple methods, when scaled up enormously, work that well. But that's where we are, and now we need to do something with this situation.

In my research group, we work a lot on language use. We think about the pragmatics of language, about the subtle cues that change and contextualize meaning. And we have a lot of projects now trying to understand to what degree LLMs can model this. I want to give a shout-out to one project we're doing: we have a benchmarking tool for LLMs that tests pragmatic abilities by letting them play dialogue games, conversational games, against themselves. But that's not what I want to talk about in the remaining however many minutes. We are also thinking about the role that LLMs can and cannot play in technology, and these are the two papers that I want to very briefly touch on in the remaining slides.
Okay, so here are three observations that we can maybe discuss later and do something with.

The first one is that LLMs have decoupled fluency from expertise. I guess you all know this, but it is surprising how many people in the general population still have not fully understood it. It used to be the case that if someone produced flawless text on a subject, they were also an expert in the subject matter. This is now very much decoupled, for technical reasons: the training objective of these models rewards them for producing the right words and the right phrases, but it has no access to the underlying reasons for why you would want to produce particular words. So it gets the tone very right for almost anything you ask it to do, but there are no guarantees on whether it will get the content right. And that is a bit of a problem, perhaps.

Second point: computers cannot assert. Even if the accuracy were a lot higher and hallucinations were very rare, they would still be fundamentally different from human speakers, because they don't care. There's no one there who could care about what is said. My conclusion from this, at least, is that humans need to retain semantic control over the output. Basically, you need to be able to understand whatever you're producing with LLMs. That restricts the use cases quite a bit, I think, but I think it's non-negotiable with these technologies.

And the third point is that understanding LLMs as language-controllable function approximators yields an understanding with which we can do something. The idea here is that instead of writing the code for a function yourself, you search for this function in the latent space of the LLM, and then you treat the LLM as if it had implemented this function. But it is only approximating this function, and interesting challenges come from that.
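To make that framing a bit more tangible, here is a minimal sketch of what treating an LLM as an approximate, validated function can look like in code; the `complete()` backend, the prompt wording, and the date-extraction task are illustrative assumptions, not something described in the talk.

```python
import json
import re

def complete(prompt: str) -> str:
    """Placeholder for whatever LLM backend is in use (local model, API, ...)."""
    raise NotImplementedError

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")

def extract_meeting_date(text: str) -> str | None:
    """Treat the LLM as an approximate function from text to an ISO date.

    The prompt is how we 'search' for the function in the model's latent
    space; the checks afterwards are how we keep semantic control, because
    the model only approximates the function we asked for.
    """
    prompt = (
        "Extract the meeting date from the following text. "
        'Answer only with a JSON object of the form {"date": "YYYY-MM-DD"}.\n\n'
        + text
    )
    raw = complete(prompt)
    try:
        candidate = json.loads(raw).get("date")
    except (json.JSONDecodeError, AttributeError):
        return None          # output was not even well-formed JSON
    if isinstance(candidate, str) and DATE_RE.match(candidate):
        return candidate     # plausible answer; the caller still decides what to trust
    return None              # approximation failed; fall back to asking the user
```

The shape is the point: prompt in, text out, and an explicit validation step in between, because nothing guarantees that the approximated function behaves like the one you had in mind.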
Okay, I'm already over time. I'm going to skip over a lot, but I want to end on what I think is a positive way forward for using AI technologies, and maybe we can discuss this in the panel.

So, in a positive future, what has happened is that users and buyers of this technology have developed a good mental model of it and have avoided all the anthropomorphization that is out there. These things are not interns, they're not coworkers, they're not friends. They're programs. They're programs that have to work and can be evaluated as such, and if they don't work, you don't use them or you fix them. The onus here is, I guess, on us as educators, really driving home the message that this is what these things are.

De-skilling has been avoided. There is a danger, when you naively build these techniques into products, that you de-skill people, because they rely on these products too much and forget how to do the tasks themselves. To avoid this, some of the work is on you: when you implement these systems, you can make engineering design decisions that avoid it. Some of it is on us, on the research side; we need to do more research on interpretability. And some of it is on the regulation side.

Another point: responsibility offloading has been avoided. The tools are designed in such a way that people actually accept the fact that they have to keep semantic control. Again, there's a design component to this. Maybe you have to design friction into the tools. For example, before sending off the email that you had the LLM formulate for you, get some proof from the user that they have actually read what has been produced. Something like that. It might be a naive idea, but it is for you to think about.
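One possible reading of that idea, as a small sketch: the specific rule here (a minimum review time scaled by the length of the draft) is just an assumed example of designed-in friction, not a recommendation from the talk.

```python
import time

WORDS_PER_SECOND = 5  # assumed upper bound on honest reading speed

class DraftReviewGate:
    """Refuse to send an LLM-drafted text until the user has plausibly read it."""

    def __init__(self, draft: str):
        self.draft = draft
        self.shown_at = time.monotonic()  # when the draft was put on screen

    def may_send(self) -> bool:
        words = len(self.draft.split())
        min_review_seconds = words / WORDS_PER_SECOND
        return time.monotonic() - self.shown_at >= min_review_seconds

# Usage: create the gate when the draft is shown, and only enable the "send"
# action once may_send() returns True (or after an explicit confirmation step,
# a scroll-to-end check, or whatever friction fits the application).
```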
Okay, what else do I have? Cost externalization has been reverted. These things need to be as expensive as they really are. We know this from many areas; it holds for traffic, it holds for manufacturing, and it holds for these things as well. They have to carry their real costs, which might drive some use cases out of being economically viable. But then so be it.

And the last point is maybe something that interests you as well: some form of cultural sustainability has to be achieved. Maybe we can come up with models where one part is licensing training data and paying the creators of the data that these models were trained on. But there might also be room for more creative models, where each time something is generated that can very clearly be traced back to source material, the creator of that source material is paid in some way. This is obviously unlike the real world: normally, if a teacher teaches you something, it's not the author of the textbook that the teacher learned it from who gets the money. But these are new things, so new models should be tried out.

Okay, with this, I think we can go over to the panel discussion. Thank you.

Thank you so much. I am going to leave it on this. Yes, coming. We are going to have a bit of a discussion here, and then also have time for audience questions. My first question for both of you: a friend of mine said a while ago that AI is always that which we don't quite understand right now, and what was AI ten years ago we today maybe no longer consider AI. I would love to hear from you: what do you consider to be AI today, and what do you maybe not consider to be AI? Anyone? Why don't you start?

Yeah, so I got the question list in advance, so I had time to think about this a little bit. I know a variant of this; it's called the AI paradox, I guess, and that is more about goalpost shifting: the complaint there is that as soon as something starts working, people say it is not real intelligence. I have little problem with this, because I don't think human value derives from intelligence only, so I don't feel threatened by programs doing tasks that appear intelligent. In that sense, it doesn't quite work for me. And in a more technical sense, AI is a programming technique for me, so what has been programmed in this particular way stays that way. On the contrary, I'm actually still blown away. I'm old enough to remember how terrible ASR was, and I'm still blown away by how well it works these days. So in that sense, there's still magic for me. That's my reply, I guess.

All right, maybe I should introduce myself first, because I don't think we've done that yet, and maybe there are some who are online who don't know me.
I'm Eike. I've been a KDE developer for about twenty years, on many of our applications and also on the Plasma user interface. "Intelligent" is a very common word, but when I go looking for guidance on what it means to me, I end up thinking about what our mission is at KDE. Our mission is chiefly to develop end-user software. So what does intelligent mean to a user? What makes a user want to describe a computer system they interact with as intelligent? And has that recently changed? I think it has, because large language models in particular have given computer systems, in a broad way, the new ability to do what you mean rather than what you say, which is not what users are used to from a computer, outside of maybe some pattern recognition, habit learning, and making suggestions like that. But now you can talk to your home and say, I want my room to look like the Barbie movie, and it turns the lights pink. That's something it just couldn't do before, and it does this in a very generalist fashion. It's statistical inference; it doesn't necessarily model you, so it doesn't understand what your needs are individually, and that's where some of the problems lurk. But I think it has raised the bar for what users expect computers to be able to do, and that is something we definitely have to deal with. We also have to communicate very clearly when we cannot actually match these expectations, and where to draw the line.

That's a very good point, and I love your Barbie example, which brings me to my next question: where do you use AI, in all its shapes and forms, in your daily life right now? Do you turn your room Barbie pink?

I do, on occasion. And, maybe I'm weird, but I love using these models to amuse myself. I often generate a funny picture just for my own purposes, or to share with someone, or a humorous text, because it does it faster than I can, and I can give it somewhat clear requirements. You can play with the humor of it a lot. ChatGPT has such a peculiar writing style that is immediately recognizable to anyone at this point, and you can play around with that. I also, and this is interesting, use it as an accelerator when I do development work.
It turns out to be most useful when I use it for an application that I'm already an expert in, because then I can give it very specific requirements, but I can also check whether it actually did what I wanted it to do; I can tell if the code is bad or not. The usefulness tends to break down as soon as I venture into a space where I cannot verify whether what I got is actually useful or not. So I think that's an interesting observation too.

Yeah, I guess AI in your question means LLMs, because I guess we all use AI in very many instances, like when we turn on the dishwasher. This kind of goes back to your first question: there are still machine learning modules in there. But for LLMs, that's interesting. As I said, this just comes out of my field, so I feel a responsibility to try it out a little bit, but I don't use it much, and maybe that is connected to the de-skilling. I've been in this job for a while now, and many of the things I'm doing I've been doing for a very long time. I think I'm still a lot better at generating chunks of text than an LLM is, and it would slow me down if I had to correct the text that comes out of it. I have found one good use case for me, though. I like coding, but I get to do it a lot less often than you guys, I guess, with gaps of months in between. So when, after a couple of months, I get to look at code again in Emacs or wherever, I have forgotten the most embarrassing things; I have forgotten the syntax of a for loop in Python or something. And it's quite good for this. I ask models to solve my problem, I look at the code, I remember what the code looks like, I remember the syntax, and then I write it myself. I completely disregard the solutions. But as a kind of prompt, as a memory aid, I guess it works quite well. I'm more worried about what it does to people who are new at tasks, rather than to people who have been doing tasks for a very long time.

I think you're onto something there. I think it's useful as a sort of checks and balances sometimes, and as a reference.
You write the code yourself, you also have it generated, you compare the two, and you check whether you've possibly missed anything, or whether the alternative approach is different. So I think that's interesting. And the access to the broad range of data it was trained on, for knowledge-based queries, is sometimes quite interesting.

I want to respond, if I can, on the de-skilling part. I used to be very worried about de-skilling, and probably I still should be, but I had an experience that made me a little bit less afraid of it. In KDE, we have a mentor program: we have a website that lists a number of members of the community whom you can contact on a one-on-one basis if you want to ask questions or want some tutoring on how to do KDE development. There was a young person who reached out to me, and they wanted to be a KDE contributor. They had had some limited programming training, but really they were starting out with C++, and this being 2022, of course they were already using ChatGPT, because it's kind of hard to ignore if you are trying to climb a mountain like that. So we made the agreement that, yes, he could use ChatGPT for this, but we would share an account, and I would be able to read what he asks the AI, so that I could steer him clear of going in a completely wrong direction. It was pretty interesting to see him interact with the system. And of course, it was very hard for him to keep up the discipline to try to do things by himself. He would lean very heavily on code generation in the end, and it would become a cycle of: well, this doesn't work, so I go back to the model immediately and ask it to fix it up or make changes. And it falls apart pretty quickly there. You can ask it to generate something once, and that sort of works, but integrating a hundred interactions over time doesn't yield a useful program. He started recognizing at some point that he would run into a kind of wall that he just couldn't leap over anymore, because he didn't end up understanding the code file he had produced in that patchwork fashion. And he came by himself to the conclusion that whenever he does it manually and fights that three-hour battle to produce one line of code, at the end of it he actually understands what it does and why it is fashioned that way.
And after about half a year, he said: this is not mentally healthy for me, I will stop using this crutch. So I'm an optimistic person, and I think that around this technology, too, society will come to a point where the use-it-or-lose-it of it all is apparent. It's just like using the stairs instead of the elevator occasionally, or things like that. And particularly with writing, I agree with you. I even turn off the word completion on my phone keyboard, which this is a sophisticated version of, and the spell checking, because I've always felt that my orthography will just suffer if I don't learn how to spell myself. I think a lot of people come to the same conclusion, so I'm not that worried about it; I think we'll learn how to keep it under control.

Of course, in a company, I guess you would want to measure whether this is an efficient way. Letting people run into walls, and take half a year to do so, there might be more efficient ways of training them up.

Yeah, but particularly if you're an employer, I think you should also learn to recognize the value of investing in your people. Obviously training is a thing, and the absence of AI tools may constitute training that gives you longer-lasting value from a person, if you end up shaping their intellect rather than turning them into a prompt engineer, for example.

I hope so, at least. And that brings us to some of the scenarios that the world is talking about, anything from "the world is going to end" to "this is the magical thing that is going to solve all of our problems tomorrow." Let's maybe start with the positive, or more optimistic, side of that. What gets you excited and hopeful right now about current AI development?

Okay, I'll go first. For me this is a fascinating, super exciting time. Obviously, there are things possible in my field that just weren't possible five years ago, and there's a huge open vista of things to explore. So that obviously is exciting. And on the more applied side, it will also be super interesting to see how people turn this into actual products.
And what I always say is that these models, and again, this is difficult to understand if you don't know them more closely, are kind of doomed to be generalists, because on some level of description they are, after all, autocomplete on steroids. They will complete anything and everything you give them: you give them a prompt, and they complete it in some way. If that's not what you want in your application, because you only want to produce a summary or you only want to do some other specific thing, you have the hard task of designing this generality out again. But I think the way forward to actual products is to do exactly that, and it's what I'm trying to get at with my metaphor of them being function approximators. So on the applied side there is also a lot of super interesting work, especially in the user interface space. I don't think the free chatbot with a weird, quirky personality type of interface is the final form of this technology. It's just the first thing people came up with, because it's so natural and you can train this behavior in very easily. But to be really useful in standard, normal, accepted ways, you have to measure the usefulness by something other than "wow, this is a funny-sounding limerick" or "I'm impressed by how cute this sounds." Those are not good metrics for measuring actual utility in a real context. So I think there's a lot to be done there, hopefully by some of you here.

I think what excites me about it comes down a little bit to how I conceive of the role of an engineer in society, or of what we broadly do at KDE. When I was a young boy, I used to go to the Berlin Natural History Museum a lot; I requested to go there every birthday I had. The Natural History Museum has a wonderful exhibition on Stone Age civilization and how people used to live many thousands of years ago, and you get to see the tools that that society used:
stones fashioned to make fire, and so on. And it occurred to me even as a kid that somebody had to think of making that stone, and then produce it and give it to others, and then they might use it to make fire, and then they sit around that fire and, I don't know, start speaking, and boom, civilization. I think that is, in its modern incarnation, what I aspire to do: to build tools that allow other people to live their lives a little bit better, and to spread civilization and culture. And I think AI and large language models are such a heated topic because they promise to help us make better tools, for example by allowing people to accomplish sophisticated tasks without specialized training, because they maybe bridge that natural-language understanding gap, the do-what-I-mean-instead-of-what-I-say part. But at the same time they seem, to some, to encroach on that humanity; rather than help, they supplant. And I think that's where a lot of the emotion comes from. What I liked about your presentation and your approach to it all is that you try to turn that level of emotion down a bit and pare it down to: what does it actually do, what is the mechanism here, what are its limitations. For developers, I think it's very useful to think of it as a pure function that accepts fuzzy parameters and whose return value you can't always trust. And what that allows you to do is very powerful. What you can now accomplish as a developer, say building a speech system in a weekend as opposed to taking five years at Amazon with 300 people, is pretty stunning. So that's exciting.

Now, we've talked a bit about the positive and hopeful side, but there's also a lot that gives people pause. What is that for you?

Right. As I said, these are powerful tools, or at least they appear to be powerful tools; they are themselves very confident that they are powerful tools. And that gives people the temptation to build powerful applications, and to shoot themselves very powerfully in the foot with them. And there's also a lot of money in the system,
like, a lot of money. I wanted to say this earlier and didn't get to: a lot of these breakthroughs, I have to admit, are coming out of commercial labs, out of industry labs, not academic labs, simply because of the insane capital expenditure that is needed to build these things. So there's a lot of venture capital in there, and at some point some people want to see returns on it. So the temptation to build applications that are given more power than they should have, given the lack of guarantees, is real. And that is where the doom lies. It does not lie in ChatGPT becoming conscious and trying to kill us all because we ask such inane questions all the time. It's people building applications that have a degree of autonomy that is not warranted. We know that complex systems are difficult, and this can go off the rails quickly. So that would be my worry.

Yeah, so I do think that around this technology we obviously have a big problem with what you could call media literacy. As you pointed out, users need to have a good model of how these systems actually work, what their limitations are, where your responsibilities as a user lie, and perhaps also whether it's actually good for you to use them or not. And a lot of the commercial vendors in this space are not incentivized to be honest about many of these things; only to the extent that they are worried about liability will they do it. I think for us, because we have a lot of freedom from those constraints, I mean, surely we also worry about liability, but we get to worry about many more things. Broadly, KDE's mission is to figure out what it means to make socially responsible software. Licensing, for us, is an important aspect of what socially responsible means; we have very strong ideas about that, and we should also delve into what we think it means in the AI model space. But socially responsible also means sustainable, for example, or protecting privacy. And it can also mean giving users a realistic picture of how much AI is good for them, and transparency over whether it's used or enabled or not, and things like that. I think that is what I expect of ourselves, that we should pay attention to this, to avoid the doom, perhaps.
That brings us right back to KDE's role, and maybe one last question before we go to audience questions. Are there areas in KDE where you see positive potential for using generative AI, in our applications, in development, in our community?

So, yes. I have a laundry list of little wish-list items where I think even the current level of technology would enhance our applications. Maybe something that you would benefit from as an academic who has to read a lot of long PDFs: I would love our document viewer to be able to answer the question of which page of that long PDF a particular topic is being talked about on, rather than having to do a keyword search, which is very clumsy. Little things like that. But again, it's not just the features. It's also the fact that we have an opportunity to do it differently from the other players. We have a goal, for example, of protecting the user's privacy, which makes us much more interested in running these technologies locally rather than on a server, and the other players are not doing that much. So there we can offer something substantially different to the user, something I think a lot of users want, and I'm hoping we get to do that.

Maybe one thought. I think you have an opportunity, since you don't have commercial pressures so much at all, to also lead in what you are not doing. A very lazy way of integrating AI or LLMs is to have a button that calls ChatGPT. That obviously requires no thought at all and does not address any of the problems I tried to highlight. So I think you have a way of doing things more cleverly than that. And as I said, my guess would be that trying to cut down on the generality, really investigating a use case, and making sure that this one use case is actually implemented well, that's an opportunity, I think.

I agree with you. I think one of the interesting things about KDE's software and its technology stack is that we've always been very driven to create libraries and frameworks that are used all across our applications and that make them able to interact with each other to some degree. And you can query a lot of information about these applications as they're running. I think there's a lot of opportunity there to find these more surgical places where you can conceive of a semi-intelligent function that you could slot in between, and it would do something useful. Not just bringing up a text box and things like that, but something that selects usefully among the things it is exposed to, for example. And I hope we figure out, and this is a really interesting developer problem, what a good API would be for developers to run AI jobs like that, and how they can do things together. That's very interesting.
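As a purely hypothetical sketch of what such an API could feel like (none of these names exist in KDE; the local model hook and the page-selection job are assumptions for illustration):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class AIJob:
    """A single, narrowly scoped inference job: one prompt in, one result out."""
    prompt: str
    on_result: Callable[[str], None]   # called with the model's output
    on_error: Callable[[str], None]    # called if the backend fails

class LocalAIJobRunner:
    """Hypothetical runner that sends jobs to a locally hosted model,
    so no document content leaves the user's machine."""

    def __init__(self, run_model: Callable[[str], str]):
        self._run_model = run_model    # e.g. a wrapper around a local inference engine

    def submit(self, job: AIJob) -> None:
        try:
            result = self._run_model(job.prompt)
        except Exception as exc:       # sketch only; a real API would be finer-grained
            job.on_error(str(exc))
            return
        job.on_result(result)

# Example use, in the spirit of the document-viewer wish: ask which of a few
# candidate pages (found by ordinary text search) best matches the user's topic.
def ask_best_page(runner: LocalAIJobRunner, topic: str, pages: dict[int, str]) -> None:
    excerpts = "\n".join(f"Page {n}: {text[:300]}" for n, text in pages.items())
    job = AIJob(
        prompt=f"Which page number best discusses '{topic}'?\n{excerpts}\n"
               "Answer with the page number only.",
        on_result=lambda answer: print("Jump to page:", answer.strip()),
        on_error=lambda msg: print("Falling back to keyword search:", msg),
    )
    runner.submit(job)
```

The design choice worth noting is the narrowly scoped job with explicit success and failure paths, rather than a general-purpose chat box, which matches the earlier point about designing the generality out for a specific use case.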
All right, thank you so much. Then let's take some audience questions. Who has questions? All right, let's start with you.

Oh, is this working? Okay. So when it comes to large language models, and this does not happen in all fields of AI, but it does happen here, one of the most serious problems is that it needs a lot of data and a lot of computing power, and we as free software developers really don't have the resources to purchase them. Also, we probably won't want to use a service that is proprietary and hosted by others; although they might have some privacy policy, we might still not like it, and we don't want to give our data to just some random third party. What do you think can solve this problem?

Yeah, this is a real problem. These models are only as good as they are because they have ingested a lot of data. And it's not a linear curve in capabilities; you just need a lot of data. It's not the case that with half as much data the model is half as good; there is a certain amount of data you absolutely need to get the basic capabilities off the ground. And that puts training these models from scratch, for the foreseeable future, out of the hands
of organizations without access to a lot of capital. So yeah, there's a problem there. There are some organizations, for example the Allen Institute for AI has just released the new version of the OLMo model, and they are quite open about the data they're using, and they've released all the checkpoints and so on. There is of course Llama by Meta, which has a slightly weird license; it might be possible for you to use it, but then you have to live with the fact that it was paid for by horrible deeds done by Meta. So yeah, this is a problem. I don't think training will ever become a possibility for normal people or academics, for entities not otherwise funded. If that rules it out completely for you, then that is a stance as well, one that needs to be discussed. But yeah, this is a deep problem at the root of it.

So I think the provenance of the training data is a topic our community thinks about a lot, obviously, because again, licensing is a very important topic for us in free software. You could say that creative use of the copyright system is one of the cornerstones of how we operate, and therefore we feel, from our hearts I think, that if you don't show integrity about protecting the rights of authors, then you're doing it wrong from a free software perspective, since we rely on that mechanism so much ourselves. At the same time, obviously, a lot of us feel that we very intentionally release our code as open source because we want others to have access to it, so to some extent we also care about there being a public domain. Our licensing doesn't correspond to the public domain, but we're adjacent to it. So, given all of that, I think what a lot of people in the free software community are looking for is: is there anywhere credible activity to produce a clean training data set with known provenance of the data, one that you can use guilt-free? Presumably, as an academic, that's also something you would have massive interest in. Are you aware of any solid effort to curate a sufficiently large training data set that produces a model that would not give us doubt and worries?

Yeah, as I said, there's this Allen AI initiative, the OLMo initiative. As far as I'm aware, they are documenting where the data is from and trying to make sure that, at least under
generous interpretations, it is acceptable to use. Anyway, it's still contested what copyright law says about training, but they are at least transparent about the data. But this is only one part, right? Having the data is great; you have several terabytes of data, that's great. But then training a model on it, which requires several million dollars in compute, is the next step. So that might be a barrier as well. But starting from the data, people are trying to address this.

Yeah, but I think we're more fine with someone donating the compute to us. It's the data, I think, that gives us more pause. Maybe to follow up on your answer a little: you mentioned that you need a certain minimum amount of data to produce a model that is useful. But as a non-expert looking at the various releases that come out, there certainly seem to be a lot of people trying to build smaller models that maybe show much poorer performance on knowledge queries, but still have better reasoning performance than you would expect from scaling the data down. And when it comes to integrating a model into our software that can act as a function and usefully select among things it is given zero-shot in a prompt, maybe a small enough model would actually be good enough for us, because we're not looking for that text-box chat interface, and we don't aim to be a Wikipedia replacement. So how true is the notion that you, as a model designer, get to pick and choose a bit, as in: yes, your training data set is smaller, but the model architecture still makes it useful as a function approximator?

You have to be careful there. There are two different dimensions to it that are somewhat orthogonal, somewhat independent of each other. One is the size of the model in terms of parameters. You can get smaller models that are quite okay, that have capabilities, but they still need a lot of data. And it turns out that over-training, training with a lot more tokens than people initially thought would be needed, creates better models even at smaller sizes. So these dimensions don't trade off the way you might hope: you still need a lot of data, even for a model with fewer parameters.
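A rough back-of-the-envelope illustrates that point, using two commonly cited rules of thumb from the scaling-law literature (training compute of roughly 6 times parameters times tokens, and a "compute-optimal" budget of roughly 20 tokens per parameter); these are assumptions for illustration, not figures from the talk.

```python
# Rough scaling-law arithmetic with two common rules of thumb:
#   training FLOPs ~= 6 * N_parameters * N_tokens     (Kaplan et al.-style estimate)
#   "compute-optimal" tokens ~= 20 * N_parameters      (Chinchilla-style estimate)
# Real numbers vary; this only illustrates why smaller models still want lots of data.

def train_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

configs = {
    "7B, compute-optimal tokens":  (7e9,  20 * 7e9),    # ~140B tokens
    "7B, heavily over-trained":    (7e9,  2e12),        # ~2T tokens
    "70B, compute-optimal tokens": (70e9, 20 * 70e9),   # ~1.4T tokens
}

for name, (params, tokens) in configs.items():
    print(f"{name}: {tokens / 1e12:.2f}T tokens, ~{train_flops(params, tokens):.1e} FLOPs")
```

Even the small model, trained the way recent small releases typically are, goes through on the order of a trillion tokens or more, which is the "you still need a lot of data" part of the answer.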
I think we have two minutes, according to the sign.

David, thanks for doing this. I would like to draw attention back to this slide. I noticed that you talk about cultural sustainability, but you don't talk about environmental sustainability. Why is that? Is it sort of like, oh yeah, we're going to have these...

That's meant to be covered by cost externalization; that's where the environmental sustainability sits.

Right, okay. Either way, you know that all of this is wishful thinking, right? None of it is happening, nor is there any sign that it is going to happen. Doesn't that give you pause?

Well, I didn't give you my negative vision for AI, which is kind of the inverse of this. But we have to fight for it. What can we do? We can't just resign. We can identify areas where we would like to intervene.

Yes.

I'm here; that's my attempt at telling you to be aware of these issues. I talk to decision makers, I talk to politicians who have dollar signs, euro signs, in their eyes when they think about AI, and I tell them: okay, here are areas where you can really do something.

Yeah, yeah, that is what's happening, but you are kind of enabling them, right?

Yeah. I don't know...

I mean, exactly, that's the thing. So I think...

Progress at any cost?

No, no, no. What I mean is that, for example, now we're at KDE. I think the worst thing we could do is ignore the topic because it is fraught with problems. Through a combination of hard work over thirty years and historical happenstance, we're one of a half dozen ways to use a computer; there aren't that many ways to interact with a PC, and we're one of them. So I think that means we get to think about what stance we take and how to do it according to our value system, and hopefully show people a better example. So I think your point is very appropriate. We have a KDE Eco initiative; what does that mean in the AI context? And what it means to be using AI in KDE software is a very good thing for us to think about.
And we will continue (I see the sign, I know, I'm ignoring it) this conversation in the hallway, I'm sure we will. Thank you so much, everyone, and thank you for inviting us.