Has spent half of his life programming, the last decade professionally in Java land. Loves back-end development and data visualization. Passionate about alternative JVM languages. Disappointed with the quality of software written these days (so often by himself!), hates long methods and hidden side effects. Interested in charting, data analysis and reactive programming. Believes that computers were invented so that developers can automate boring and repetitive tasks, including their own.
On a daily basis works in the e-commerce sector. Involved in open-source, DZone’s Most Valuable Blogger, used to be very active on StackOverflow. Author, trainer, conference speaker, technical reviewer, runner. Claims that code not tested automatically is not a feature but just a rumour. Wrote a book on RxJava for O’Reilly.
CharBusters – 10 Unicode Myths
Character encoding, just like many other tasks in our industry, seems straightforward and predictable. But when you look closely enough, the correct encoding of Turkish or Chinese characters is not that obvious. During this seemingly trivial talk, I’ll show you how complex Unicode can get, what kinds of problems (including security-related ones) we can expect, and how many traps are buried in the specification and its Java implementation. For example, did you know that a single UTF-32 character can take as many as… 28 bytes? And did I mention the Oxford Dictionaries Word of the Year 2015, that is… ‘😂’ (yes, an emoji; we’ll talk about them as well)?
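The abstract does not say which character the 28-byte figure refers to. One plausible reading, offered here purely as an assumption, is a user-perceived character built from several code points, such as the "family" emoji made of four people joined by three zero-width joiners. UTF-32 spends 4 bytes per code point, so seven code points come to 28 bytes. The class and method names below are hypothetical, and the size is computed from the code point count rather than by calling an encoder.

```java
public class Utf32SizeDemo {

    // UTF-32 uses exactly 4 bytes for every Unicode code point (ignoring any BOM).
    static int utf32Bytes(String s) {
        return 4 * s.codePointCount(0, s.length());
    }

    // Assumed example: man + ZWJ + woman + ZWJ + girl + ZWJ + boy,
    // typically rendered as a single "family" glyph.
    static String familyEmoji() {
        int[] codePoints = {0x1F468, 0x200D, 0x1F469, 0x200D, 0x1F467, 0x200D, 0x1F466};
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        String family = familyEmoji();
        System.out.println("UTF-16 code units: " + family.length());
        System.out.println("Code points:       " + family.codePointCount(0, family.length()));
        System.out.println("UTF-32 bytes:      " + utf32Bytes(family));
    }
}
```

Whether this sequence shows up as one glyph depends on the font and platform; the byte count does not.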
If you believe that char is the best type to encode characters in Java, come to my talk!
If you have ever used String.length() to calculate the length of a String and you think it returns the number of characters, I’ll tell you the truth.
If you don’t want people getting killed because of a Unicode-related bug in your application, I’ll teach you how. Yes, unfamiliarity with Unicode can kill. Literally.
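As a small taste of the char and String.length() points above, here is a minimal sketch (the class and method names are made up for illustration). In Java a char is a 16-bit UTF-16 code unit, so any code point above U+FFFF, such as the 😂 emoji (U+1F602), needs two of them, and String.length() counts code units rather than characters.

```java
public class StringLengthDemo {

    // What String.length() reports: the number of UTF-16 code units (chars).
    static int codeUnits(String s) {
        return s.length();
    }

    // The number of Unicode code points, which can be smaller.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F602 does not fit in a single char; toChars yields a surrogate pair.
        String joy = new String(Character.toChars(0x1F602));

        System.out.println("length():         " + codeUnits(joy));
        System.out.println("codePointCount(): " + codePoints(joy));

        // charAt(0) returns only the first half of the emoji, a high surrogate.
        System.out.println("charAt(0) is a high surrogate: "
                + Character.isHighSurrogate(joy.charAt(0)));
    }
}
```

Even the code point count is not the whole story: a letter followed by a combining accent is two code points that a reader sees as one character.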