LLM Agents, Text Vectorization, Advanced SQL and other must-reads from our newest authors

If you’re a regular reader of the Variable, you may have noticed that we emphasize — every week — that TDS is always open to contributions from new authors. And we mean it! Some of you may have seen this post and thought something like, “Great, I’d love to write an article!” but then wondered what kinds of posts would be a good fit, what topics our readers are interested in, and what kinds of experiences and skills are welcome.

This week’s edition of the Variable highlights some of our best recent writing, focusing exclusively on posts from our most recent cohort of authors, in the hopes that their work inspires you to give it a try. And if you don’t feel like becoming a TDS author, that’s totally fine! As always, we hope you enjoy your read.

As you’ll see, TDS contributors come to us with a wide range of experience levels (from entry-level learners to PhDs and industry veterans), interests, and writing styles. What unites them is a knack for great storytelling and a desire to share their knowledge with a broader community. We hope (and are pretty sure) that you’ll enjoy our weekly lineup.

  • What do large language models ‘understand’?
    “When we attribute human abilities to LLMs, we engage in an anthropomorphic bias by comparing their abilities to our own. But do we also display an anthropocentric bias by failing to recognize the abilities that LLMs consistently demonstrate?” In one of the most thought-provoking articles we’ve read recently, Tarik Dzekman addresses the question of LLMs’ ability to understand language, looking at the topic through a philosophy- and psychology-informed lens.
  • Integration of LLM agents with LangChain in VICA
    “Our goal is to say goodbye to the robotic and clunky form-like experience within a chatbot, and hello to personalized conversations with human assistance.” Ng Wei Cheng and Nicole Ren share practical insights and lessons learned from their extensive work on Singapore’s GovTech Virtual Intelligent Chat Assistant (VICA) platform.
  • Text vectorization demythologized: turning language into data
    “For those of us who are aware of the machine learning pipeline in general, we understand that feature engineering is a very crucial step in generating good results from the model. The same concept applies to NLP as well.” Lakshmi Narayanan provides a thorough overview of text vectorization approaches and weighs their respective advantages and limitations.

Photo by Totte Annerbrink on Unsplash

  • Use Gemini-1.5-Pro-Latest for smarter eating
    “It is worth noting here that with the advancements in the world of AI, it is the responsibility of data scientists to gradually transition from traditional deep learning to generative AI techniques to revolutionize their role.” Mary Ara presents an end-to-end project walkthrough showing how to do just that – in this case, by creating a calorie tracking app that uses a sophisticated multimodal model.
  • The Most Useful Advanced SQL Techniques to Succeed in the Technology Industry
    “While it is relatively easy to master basic and intermediate SQL, it is sometimes challenging to master the tool and use it proficiently in a variety of scenarios.” Jiayan Yin aims to help data analysts and other professionals close that skills gap with a comprehensive overview of the more advanced SQL techniques you should add to your query toolkit.
  • Refine the audio spectrogram transformer with Hugging Face transformers
    “This process adapts the model’s capabilities to the unique characteristics of our dataset, such as classes and data distribution, ensuring the relevance of the results.” At the intersection of machine learning and audio data, Marius Steger describes a detailed workflow for fine-tuning the Audio Spectrogram Transformer (AST) on any audio classification dataset.
  • Algorithm-agnostic model building with MLflow
    “Imagine this scenario: We have a sklearn model that is currently in production for a particular use case, and later we see that a deep learning model performs even better. If the sklearn model is deployed in its native format, the transition to the deep learning model can be difficult because the two model artifacts are very different.” Mena Wang, PhD, explains why working with algorithm-agnostic models can sometimes make a lot of sense, and shows you how to get started in MLflow.
  • A new look at nonlinearity in deep learning
    “But why do we need activation functions, specifically nonlinear activation functions, in the first place? There is a traditional reasoning, and also a new way of looking at it.” Harys Dalvi unpacks the stakes of using a linear layer for the output of deep learning classifiers, and the value we can get from interpreting the consequences of linearity and nonlinearity in multiple ways.

Thank you for supporting the work of our authors! As we said above, we love publishing articles from new authors. So if you have recently written an interesting project walkthrough, tutorial, or theoretical reflection on one of our core topics, please don’t hesitate to share it with us.

Until the next Variable,

TDS Team


LLM Agents, Text Vectorization, Advanced SQL, and other must-reads from our newest authors was originally published in Towards Data Science on Medium, where people are continuing the conversation by bookmarking and commenting on this story.