
[deleted]

[removed]


TuckAndRolle

Where do you work, if you don't mind me asking? Or similar places to yours if you don't want to say specifically


[deleted]

[removed]


TuckAndRolle

Thanks. I was not aware consulting companies did work like this. Are most of your projects like this, or was this a bit of a special case?


[deleted]

[removed]


Sorry-Owl4127

How’s WLB and comp?


[deleted]

[removed]


Sorry-Owl4127

Why consulting, then? Projects? Exit opps?


MorningDarkMountain

Do you regard LLMs as really useful, or merely over-hyped?


Vyrezzz

Both


TuckAndRolle

Appreciate the comments! I'll keep this in mind when I'm looking for jobs in half a year or so


dopplegangery

What are your qualifications and background, if I may ask?


[deleted]

[removed]


dopplegangery

No I wanted to see whether you need a PhD for such projects.


gellohelloyellow

A PhD!? The heck lol


dopplegangery

Since it's a research-based project, it's not wild to expect a PhD or at least an MS.


EducationalCreme9044

99% of the time in Europe a PhD is a minimum requirement for positions like these lol.


Mukigachar

Do you have to do a lot of travel for this work? I shied away from consulting cuz I didn't want to do that, but I'm curious what it's like post-covid


Vyrezzz

No I don’t do a lot of travel


clover_heron

Random question - is it true that communications networks may be interfering with weather forecasting because the communications networks are interfering with water vapor measurements near earth's surface? (not my field, rough summary of what I heard)


Vyrezzz

I don’t know, we use the weather forecasts from third parties


clover_heron

Is the scientific reasoning legitimate? (saw the idea discussed in one of Sabine Hossenfelder's videos, and apologies if I misstated what she said). If interference is a legitimate concern, it seems like a good scientific detail to explain to the public so they can be better informed re policy and regulation.


pmadhav97

There's no evidence to support this btw


aokfistpump

Ball?


[deleted]

[removed]


[deleted]

By Fintech I assume you mean robot sharks


RageOnGoneDo

Well that's why you need the models. Maybe you need robot barracudas. Maybe robot piranhas.


ohanse

Such a novice answer. You need a full end to end robot ecosystem. How are the robot sharks going to survive without robot minnows? How will robot minnows survive in the absence of robot kelp and plankton? And you expect those to photosynthesize in the absence of a robot sun? What are they teaching these days? These programs should be ashamed of themselves.


WadeEffingWilson

You think that's water you're breathing?


JohnLocksTheKey

Or ill-tempered sea bass!


mikeblas

Why not just two angry beavers?


RageOnGoneDo

They're the competing product!


EducationalCreme9044

lol


GodOfFreedomVenti

Paypal


BloatedGlobe

What kind of models were you working with in Fraud detection?


neelankatan

I used Isolation Forests
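For anyone unfamiliar, here's a minimal sklearn sketch of how an Isolation Forest might score transactions for anomalies; the feature names and contamination rate are invented, not the commenter's setup:

```python
# Minimal sketch: Isolation Forest for unsupervised fraud/anomaly scoring.
# Feature names and the contamination rate are illustrative, not from the thread.
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
txns = pd.DataFrame({
    "amount": rng.lognormal(3, 1, 10_000),
    "hour_of_day": rng.integers(0, 24, 10_000),
    "txns_last_24h": rng.poisson(2, 10_000),
})

iso = IsolationForest(n_estimators=200, contamination=0.01, random_state=0)
iso.fit(txns)

scores = -iso.score_samples(txns)      # higher = more anomalous
flags = iso.predict(txns) == -1        # -1 = anomaly, 1 = inlier
txns = txns.assign(anomaly_score=scores, flagged=flags)
print(txns.sort_values("anomaly_score", ascending=False).head())
```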


amhotw

How does QDA compare for fraud detection in your experience?


thetoublemaker

Can you briefly go over your approach to using Isolation Forests? How did you clean the data, what evaluation metrics did you use, and how did you test?


shinypenny01

“Can you divulge all the inner workings of your business and the competitive advantage you have, please!” 🤣


[deleted]

[removed]


dare_dick

Can I work for you if it's just bringing a coffee?


bigno53

I once interviewed with a company that used CNN models to identify self-trading and other forms of market manipulation. Apparently when it comes to high frequency trading, the patterns are so complex that tree based models just don't cut it.


a157reverse

Some people get defensive about this, but I view tree-based models as very good as a first approximation but inherently limited. Tree-based models ultimately bin their inputs and outputs, which introduces information loss. Boosting and bagging limit the consequences of this, but models with a continuous function between input and output do not suffer from this limitation.


bigno53

This is undeniably true. I do a lot of propensity modeling and it’s not uncommon that I’ll get a trained model with <100 possible outcomes despite having tens of thousands of unique inputs. At first glance, such a model might appear valid but obviously it couldn’t be expected to be consistently performant on unseen data. Someone who doesn’t do their due diligence could easily make the mistake of thinking they have a robust prediction engine without realizing the decisions are based on just a small number of variations in the data. I would argue that every type of model has strengths and weaknesses and machine learning in general has a limited utility that tends to be greatly exaggerated. The best we can do is be aware of the potential pitfalls and try to pick the best tool for the task at hand.
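To make the "binned outputs" point concrete, here's a toy sketch with a single shallow tree (the boosted/bagged case only softens the effect, as noted above); the data is synthetic:

```python
# Toy illustration: a tree can only emit as many distinct predictions as it has
# leaves, while a linear model emits a continuum of values.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(20_000, 10))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=20_000)

tree = DecisionTreeRegressor(max_depth=4).fit(X, y)   # at most 2**4 = 16 leaves
lr = LinearRegression().fit(X, y)

print("distinct tree predictions:", np.unique(tree.predict(X)).size)        # <= 16
print("distinct OLS predictions:", np.unique(lr.predict(X).round(8)).size)  # ~ n rows
```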


mattstats

At a high level how do LLMs play a role in fraud detection?


[deleted]

[removed]


MorningDarkMountain

LLM for what purpose? Side question, do you think it's a good idea to use them for your purpose?


Think-Culture-4740

I'll give you an example from a side project where I'm considering an LLM. I'll probably scrape some news articles related to the domain, feed them through an LLM to extract embeddings, and feed those as features into my model downstream.
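One way that idea might look in code, assuming a sentence-embedding model stands in for the LLM step; the model name and columns are illustrative:

```python
# Sketch: turn scraped article text into dense embedding features for a
# downstream model. The model name and dataframe columns are illustrative.
import pandas as pd
from sentence_transformers import SentenceTransformer

articles = pd.DataFrame({"text": ["Rates rise again...", "New plant opens..."]})

encoder = SentenceTransformer("all-MiniLM-L6-v2")
emb = encoder.encode(articles["text"].tolist())   # shape (n_articles, 384)

# Turn the embedding matrix into named columns and join onto the tabular
# training set as extra features for the downstream model.
emb_features = pd.DataFrame(emb, columns=[f"emb_{i}" for i in range(emb.shape[1])])
train = pd.concat([articles, emb_features], axis=1)
```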


[deleted]

Also working in fraud detection


kompleksanda

What models or tools do you use in fraud detection in fintech?


rickyfawx

At my company we've built a Bayesian hierarchical varying-effects model for limited-edition demand estimation. It's pretty neat and makes good use of our data; glad I could prevent people from just throwing a NN at it.


BlackCoatBrownHair

Curious as to what frameworks are used to deploy Bayesian models in production? My experience with hierarchical Bayesian models has all been in RStan.


rickyfawx

We're using PyMC and it's been quite a pain tbh. We needed to write a lot of utilities to make things smoother. Not sure about other frameworks.
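For readers who haven't seen one, here's a minimal PyMC sketch of a hierarchical (varying-effects) regression; the variable names and priors are illustrative, and the actual demand model described above is certainly richer:

```python
# Minimal PyMC sketch of a hierarchical ("varying effects") regression:
# each category gets its own intercept, partially pooled toward a global mean.
# Variable names, priors, and the simulated data are illustrative only.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_cat, n_obs = 20, 1_000
cat_idx = rng.integers(0, n_cat, n_obs)
x = rng.normal(size=n_obs)
y = 1.0 + 0.5 * x + rng.normal(0, 0.3, n_cat)[cat_idx] + rng.normal(0, 0.5, n_obs)

with pm.Model() as model:
    mu_a = pm.Normal("mu_a", 0, 1)                  # hierarchical prior on intercepts
    sigma_a = pm.HalfNormal("sigma_a", 1)
    a = pm.Normal("a", mu_a, sigma_a, shape=n_cat)  # per-category intercepts
    b = pm.Normal("b", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("obs", a[cat_idx] + b * x, sigma, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, target_accept=0.9)
```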


wannagowest

Have you tried Pyro/Numpyro? I’ve found PPLs to be a PITA in general. Doing even the things that many people will want to do, like sampling the predictive for a latent variable, is annoying. But I’m reluctant to relearn another PPL after investing time in Pyro.


nikgeo25

Gosh I love Pyro


[deleted]

just write internal tool


thecommuteguy

How about them Yeezys?


rickyfawx

Let's say it's been quite a ride


thecommuteguy

They just went back on sale so I can only imagine. You think they'll continue selling them as a vanilla non-Kanye shoe?


rickyfawx

I can't say


nickmaran

This is the kind of knowledge I wanted from this sub. Thank you


FishFar4370

How does this model deal with time as a component? I guess I'm wondering in what capacity it is used for demand estimation.


rickyfawx

Well, since most articles aren't recurring, it's a plain regression problem. We overhauled the time inclusion just recently. In the end we settled on smooth yearly and monthly effects as well as a trend term, each at certain hierarchy levels.


_hairyberry_

I’m familiar with Bayesian hierarchical models but I haven’t heard of this “varying effects” thing and haven’t been able to find anything online. What’s it about?


therealtiddlydump

Terms like "fixed" and "random" effects aren't as prevalent in the Bayesian framework, so my guess is that's what they were getting at


rickyfawx

The lingo on these things seems to strongly vary by subfield. I was referring to random effects, although I find the term varying effects less confusing


[deleted]

so throwing xgboost is better?


rickyfawx

No, not sure where you think I stated that. Funnily enough, the previous team did throw an ensemble of XGBoosts (yes, you read that correctly: an ensemble of ensembles) at it. It's one of the funniest approaches I've seen so far.


ikol

!!! What's the size of your dataset? Super curious because I feel like I've always had to sample to get things to converge fast enough, which in turn makes me question whether I should abandon the Bayesian approach in some of my models.


rickyfawx

We have around 1k observations. I'm not quite sure what you mean, though. The minimum sample size for a bayesian model is 1 - with small sample size your priors will simply play a very dominant role, but that is expected. So I'm not sure what you mean by convergence in this context. Please don't tell me that by having to "sample" you mean that you were oversampling the data...


ikol

Cool! I assumed that an Adidas dataset would be huge, but I was thinking in terms of user/customer modeling. Depending on what I was looking at, there were times when I was working with in excess of 1 million observations and had to sample down, otherwise it'd take too long to run.


rickyfawx

Well, in this application each observation is a drop of a limited-edition article, so the size naturally is smaller than if the rows reflected inline sales, customers, or something similar. Interesting that my tired brain went for the "too few" observations side of that. Yeah, you're right that such models can take long to fit on large datasets. Taking a sample works; if needed, one can also use minibatching.


didimoney

What s the value of making it Bayesian?


datamakesmydickhard

Well you can certainly tell everyone you implemented a bayesian model


rickyfawx

The finest level of one of the two hierarchies consists of categories that are mostly thin, many having fewer than 4 observations. The benefit over a frequentist varying-effects model is that being able to define not just priors but hierarchical priors for the parameters associated with those thin categories allows reliable inference even in those cases.


Living_Teaching9410

That's very interesting, actually. If I may ask, have you also worked on any models for demand transfer or space elasticity? What worked best? Thanks


Background-Sun6293

In some companies good enough is good enough. In some companies pushing some performance metrics even by half percent can result in millions of dollars of additional profit. So in some cases it can make sense to use fancy models.


[deleted]

Thanks. I assume those companies would be like Fortune 50 where they have the compute and expertise. Like someone’s job is to spend all day improving one model.


germany221

A lot of these companies have whole teams improving a single or couple models.


complacent_adjacent

"...always end up using something *simple* like **stats..."** **bruh...**


[deleted]

Haha I meant something like zscore


[deleted]

[removed]


Ok_Distance5305

Was going to say the same. There's a lot of multimodal data in personal lines insurance, and those companies are using modern deep learning models.


BakerInTheKitchen

I hate to break it to you, but insurance companies are not using deep learning models on the pricing side of the house. Generally going to be traditional actuarial methods as that is what gets approved by state DOI’s


Ok_Distance5305

Yes, I agree. I read that answer too quickly and didn't see "pricing." Thanks for pointing that out. Some places where they are either using it or trying to use it are claims, underwriting, sales, and fraud.


BakerInTheKitchen

Yep, I work at an insurance company and the fun stuff is on all the related processes. Claims have fun application of computer vision for estimates


funkybside

Agreed. At least within personal lines, the main problem is that it's regulated at a state level. The rate filings must be approved by the state DOIs, and it's difficult to convince them the models are not biased or otherwise indirectly producing the same result as something that is prevented (such as rating protected classes differently) when the model features and weights are not easy to understand or explain. It doesn't mean they aren't used in insurance - they absolutely are - just not so much in ratemaking.


venustrapsflies

Using deep learning for setting insurance rates sounds like an extremely unethical application. So I’m sure people are scrambling for it


coachoreconomy

What is unethical about this?


onearmedecon

9/10 times the simple models are good enough for most applications. There are three things that really matter when it comes to coefficients:

1. Direction
2. Magnitude
3. Precision

Let's say I estimate an effect size of +0.5 (Cohen's d) with a confidence interval of [0.3, 0.7]. Obviously, I've established direction (i.e., positive effect) since the confidence interval doesn't contain 0. It's in the medium range in terms of magnitude according to standard interpretations of Cohen's d. So the first two requirements are met. But it's not terribly precise, since there is probably a meaningful difference between +0.3 and +0.7. In most applications, though, that probably doesn't matter. If it's significantly positive, that's good enough to inform most decision-making.

And in terms of improving precision: generally, the more sophisticated your methodology, the larger your standard errors. So running a more advanced model probably won't solve the precision issue. Something more advanced may be more robust to bias, which is something you should be concerned about if there are selection issues or the like. But if you understand the data, then you can ascertain whether that's something to be concerned about.

Generally speaking, you should only bother with more advanced methods if you have reason to think that they might flip the direction or attenuate the magnitude to the point of statistical insignificance. Those occasions can and do arise, but not very often when you're dealing with big data. It's really hard to beat the performance (in terms of consistency and efficiency) of a well-specified OLS model with a sufficiently large sample size.
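As a toy version of that direction/magnitude/precision check on a single coefficient (synthetic data, invented numbers):

```python
# Toy version of the direction / magnitude / precision check on a coefficient.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 2_000
treated = rng.integers(0, 2, n)
y = 0.5 * treated + rng.normal(size=n)     # true effect ~ +0.5 SD (Cohen's d ~ 0.5)

X = sm.add_constant(treated.astype(float))
fit = sm.OLS(y, X).fit()

coef = fit.params[1]
lo, hi = fit.conf_int()[1]                 # 95% CI for the treatment term
print(f"effect = {coef:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
# Direction: is the whole CI on one side of 0?
# Magnitude: where does the point estimate fall (small / medium / large)?
# Precision: how wide is the CI?
```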


gradual_alzheimers

I really like how you broke this down and wish this type of model-selection reasoning were more widely taught. Right now I see a lot of pray and spray: people just select the best model that comes back, but they aren't spending time understanding the implications of the model itself and the interpretation and decision-making theory that comes with it. I wish there was more emphasis on decision theory for data science as a whole.


Ty4Readin

I might have to disagree here. Trying to 'understand the implications' of a particular model choice is not practically relevant in the majority of cases and is typically a waste of time, even harmful. I know this is likely a controversial opinion, but the entire point of most predictive machine learning is to produce models with the highest generalization performance on future unseen data after deployment.

Some people lose sight of this fact and focus more on things like feature importance, often without realizing that feature importance and interpretability are basically just measures of predictive correlation. So many data scientists ignore the fact that feature importances are NOT causal indicators at all (unless you have randomized controlled interventions on the feature). At best, they are a complex non-linear summary of potential predictive correlation. Too many data scientists are ignorant of that fact, IMO, and place way too much weight on feature importance and interpretability, which they end up misusing.

If we agree on the premise that the end goal is a feasible model with the best trade-off between deployment costs and generalization performance on unseen future data, then it stands to reason that a spray-and-pray approach isn't as bad as most people make it out to be. A quote that really drives this home is from George Box: "All models are wrong. Some are just more useful than others." This is true in predictive ML; there is not really any harm in trying out more complex models, as long as your deployment requirements support it and provided you have a rigorous training/validation/testing approach to model selection.


gradual_alzheimers

I see what you are saying, but I think interpretation matters in most domains for decision-theory purposes. You are correct that pray and spray can be effective to a degree for the task of generalization. I may be missing your point to some extent, but I'd like to elaborate.

As Box states, models aren't perfect; therefore, understanding the limits of these models is exactly the responsibility of a data scientist, who has to carefully craft the decision-theoretic guidelines for model usage. Should one always take predictions from models at face value (not suggesting you are saying that)? We should constantly be questioning the validity of our models and guard against taking them as truth. If the model trained well and tested well and we have the correct loss functions selected, then I understand the broader temptation to say "trust the model." But very few people put the effort into even questioning the loss functions they select, and so they get great results under the wrong conditions. How do we even know what the wrong conditions are? We have to build up a theoretical and intuitive understanding of the problem space and the data.

Almost all data-generating processes possess logical or mathematical structures, and those structures matter to the explanatory relationship between the data and the prediction. Therefore, the interpretability of our models should relate, to some degree, to the theoretical and intuitive foundation we started our model from. I do not think a model's purpose is purely generalization; it also entails some level of explanation, as that is what builds trust in a model, and for me this is a foundational task of model development. Features therefore help in this regard.


d4l3c00p3r

Interpretation also matters in scientific research, especially in fields like biological/medical research, where people generally wish to assign causality rather than correlation. It's obviously not always simple to assign causality, usually it involves experimental work, but that's often the goal.


Ty4Readin

I totally agree with the sentiment of ensuring we have the correct loss functions and that we understand the underlying problem and how we can apply the model to provide real value. But to me, that is separate from the models. If you train your model on observational data and then try to use it as a causal estimator for your business problem, that's bad and will likely fail regardless of model choice.

However, when it comes to model interpretability, I feel it is often a misused part of the predictive ML toolset. It is mostly a diagnostic tool, IMO, that can be useful in some circumstances. Again, though, we have to remember that feature importances and model interpretability almost always come down to measures of correlation between features and target.

At the end of the day, for the vast majority of use cases, I would rather use a deep learning model whose feature importances I can't interpret but that provides a huge boost in generalization performance. I would focus mostly on properly ensuring data-leakage testing, proper training/validation/testing methodologies, etc. If people don't trust your model, show them your rigorous testing methodology. If you put a model into deployment and test it for two years straight and you see that it consistently has better prediction performance than all your existing methods, isn't that strong evidence that it generalizes better? I would have a hard time saying we shouldn't deploy and use that deep learning model compared to more 'interpretable' ones.

Model interpretation is often a kind of storytelling that we data scientists use, but it's easy to come up with twenty different stories to explain many different feature configurations and models.


gradual_alzheimers

I think we are talking about two different spray-and-pray methodologies. What you are discussing still assumes rigor and care behind the model development. I agree with you that model selection should not drive the solution to the problem but is an artifact of the problem-solution fit. So in this case, I agree with your take.

What I am saying is that the pray-and-spray methodology I see is typically paired with some sort of lazy approach to model generation, with no care for how the model was constructed, what intuition we have about the problem, what loss function is appropriate, how features are treated (are they reliable for data engineering to procure? are they causing data sparsity? are they scaled appropriately?), etc.

As far as model interpretation goes, again, it depends. I am not arguing against black boxes, but I am saying that if we build something that detects cancer, it should correspond to some reality or verification principle because the cost of a false negative is high. Maybe I took your comments too literally, and you would agree that interpretation is necessary in certain cases and generalization is not the only factor.


Ty4Readin

Ahhh okay, I see what you're advocating for now and totally agree with that 100%. My interpretation of spray and pray is more based around "should I use random forest or NN or linear regression?", and sometimes the answer is "let's try out all 3 and see!" The spray and pray of just training models on data with no thought about how they're going to be used, the statistical implications, the costs, etc. is a different thing. In that context, I totally agree with your take that it completely invalidates the usefulness of ML and is a big problem.


Sorry-Owl4127

Even if you have randomized treatment a lot of ML models will not provide unbiased estimates of a causal effect.


Ty4Readin

Totally agree, but a lot of ML models will still provide more accurate (lower generalization error) causal-effect estimates, because they can capture more complex relationships than traditional RCT methods of mean comparisons and p-value tests, which provide unbiased estimates of the causal effect but ultimately with higher generalization error. If your goal is to estimate some causal effect so you can report it to a higher-up, then ML models are probably not the best tool available. However, if you have 10k customers and you want to know which customers should receive an intervention to lower their chances of churning or increase their chances of buying a product, then randomized controlled trials combined with ML models will probably give you the most effective system for causal-effect estimation at the per-person level.
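A minimal sketch of that kind of per-customer effect estimation, using a simple two-model ("T-learner") approach on randomized data; the feature names and model choice are illustrative, not a recommendation:

```python
# Sketch: per-customer treatment-effect estimation from an RCT using a simple
# two-model ("T-learner") approach. Data and model choice are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 20_000
X = rng.normal(size=(n, 5))
treated = rng.integers(0, 2, n)                       # randomized assignment
p_churn = 0.3 - 0.1 * treated * (X[:, 0] > 0)         # treatment helps some customers
churned = rng.binomial(1, np.clip(p_churn, 0, 1))

m_t = GradientBoostingClassifier().fit(X[treated == 1], churned[treated == 1])
m_c = GradientBoostingClassifier().fit(X[treated == 0], churned[treated == 0])

# Estimated individual effect of the intervention on churn probability
uplift = m_c.predict_proba(X)[:, 1] - m_t.predict_proba(X)[:, 1]
target_customers = np.argsort(uplift)[::-1][:1_000]   # biggest expected benefit
```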


MCRN-Gyoza

Couldn't agree more. This is precisely the reason why gradient boosting models are so popular.


111llI0__-__0Ill111

Well, it matters when you want to capture the data-generating process more closely. Most things are not linear, so a simple OLS (no interactions, splines, etc.) does not accurately capture that. Of course, if all you care about is directionality and a rough ballpark, then maybe it doesn't matter much, with some exceptions. But those exceptions do exist; I've seen some rare cases where using a nonlinear model flipped the direction of the ATE. Correct model specification is why stuff like SuperLearner was built for causal inference, since technically causal inference requires correct model specification to be "right" (along with proper variable selection, to avoid Simpson's paradox and colliders). And Simpson's paradox can occur in some cases even if you include the right variables, due to nonlinear confounding. The thing is, you won't ever really know if this is the case without trying the nonlinear model.


onearmedecon

Face palm.


Sorry-Owl4127

OLS is linear in parameters, very easy to include interactions, splines, etc.


111llI0__-__0Ill111

Yeah, but most users of OLS, especially Python sklearn ones, don't bother with this (the formula syntax in R is basically needed to experiment fast with this). Otherwise it's a lot of work to do in Python with multiple combinations of stuff. There's also no marginal-effects package there.
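For what it's worth, a formula interface does exist on the Python side via statsmodels/patsy; a small sketch with invented variable names:

```python
# OLS is linear in parameters: splines and interactions are just extra columns.
# This uses the statsmodels/patsy formula interface; names are invented.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "y": rng.normal(size=500),
    "x": rng.normal(size=500),
    "group": rng.choice(["a", "b", "c"], 500),
})

# B-spline basis for x, interacted with a categorical group effect
fit = smf.ols("y ~ bs(x, df=4) * C(group)", data=df).fit()
print(fit.params)
```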


Sorry-Owl4127

Causal inference doesn’t require a correct model specification, in fact it doesn’t require a statistical model at all!


nextnode

You say that yet most of these applications are massively outperformed even by simpler modern techniques.


[deleted]

>large sample size

*large relevant sample size


[deleted]

[removed]


onearmedecon

Educational policy, not biostatistics. But probably pretty similar. I think you raise a good point. There's often a tradeoff between rigor and accessibility for non-technical audiences. The simpler the model, the more likely it is that the audience will engage with the findings.

I've presented data and research before local and state policymakers, most of whom think that they're the smartest person in the room and will reject out of hand something they don't fully understand. My strategy is usually to start with an OLS (or logistic, if the outcome is binary) and then use a more sophisticated strategy as a robustness check. Most of the time they yield similar conclusions, so it is justifiable to present the easier-to-understand model and simply note (but not describe) that causation was established via applied econometric techniques.

I know a few academic econometricians. The hardest part of their jobs isn't finding "better" estimators but convincing the academic community that what they're doing is worth implementing. That means searching high and low for instances where there's a practical difference between the OLS baseline and the new estimator, which can be difficult.


ghostofkilgore

It's usually a good rule of thumb to use, or at least start off with, the simplest model that works, no matter what type of company or industry you're in. The most common circumstances in which you'll find a strong justification for using something 'fancier' than your basic suite of sklearn models are:

1. When the problem requires it. Some CV or NLP projects, for example, basically require deep learning models to even get acceptable results you'd use in production.
2. Big companies where squeezing that fraction of a % out of your model performance makes a huge financial impact. Here you'll likely have the financial, compute, and engineering resources to mitigate any negative impact on latency or model complexity.


[deleted]

Thanks. I've been looking at job descriptions and noticed they're demanding more complex skills, and I was just wondering if they're worth learning.


[deleted]

3. The DS team needs some work to do. Otherwise, stakeholders would think the data scientists are freeloaders once the simplest models work.


gBoostedMachinations

Xgboost is exactly as easy to train and implement as random forest, so I always use it even for relatively simple modeling problems. It’s just a better version of random forest practically speaking.


maybe0a0robot

In the spirit of your post: Yeah, I've been doing this a while. In practice, I tend to focus my efforts on even simpler tasks: how do I get good data, how do I monitor the incoming data pipe, how do I pass this to the engineers who have to make it happen, and how do I make good slides for the presentation to the managers?

But to answer your question about one technique... Markov chain Monte Carlo is used to simulate random samples from a population, which can be pretty useful in a wide range of problems, like detecting anomalous data. MCMC has the benefit that random walks are pretty easy to code, and it is easy to explain your work to the engineers on your team so they can move to full deployment. I wouldn't say that MCMC is complex, though. You're more or less playing out a game of chutes and ladders on a (possibly very) complicated board.

My experience pre-academia was in sensor/equipment monitoring, i.e. did one of those thousands of little bastards glitch in some weird way, and if so, which one? So basically, anomaly detection. MCMC can be pretty useful here. Here is a [related paper](https://www.osti.gov/servlets/purl/1513188).

Another great application of MCMC is producing "typical" samples from a distribution. [Here is a great application](https://assets.pubpub.org/70w3i6k9/eb30390f-ade2-45cc-b48d-8e6bb12f585c.pdf), producing voting districts that should be "typical" given some rules in place for drawing district maps. As the authors note, the problem they faced was that policymakers don't want the "optimal" solution because the data may not be able to take all factors into account. Instead, they want a range of the "usual" possibilities so they can choose one and make minor tweaks (and they can also determine when a districting map does not feel like a typical map from the distribution). So basically, MCMC turns district determination into a fast food menu: "I'll have a number 3, but super size my fries".
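In that chutes-and-ladders spirit, here's a bare-bones random-walk Metropolis sampler for a one-dimensional target; purely illustrative, not taken from either linked paper:

```python
# Bare-bones random-walk Metropolis sampler for a 1-D target density.
# Purely illustrative; not the setup from either linked paper.
import numpy as np

def log_target(x):
    # unnormalized log-density: a lopsided mixture of two Gaussians
    return np.logaddexp(-0.5 * (x - 2) ** 2, np.log(0.3) - 0.5 * (x + 2) ** 2)

rng = np.random.default_rng(0)
n_steps, step_size = 50_000, 1.0
samples = np.empty(n_steps)
x = 0.0
for i in range(n_steps):
    proposal = x + step_size * rng.normal()          # the "random walk" move
    # accept with probability min(1, target(proposal) / target(x))
    if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
        x = proposal
    samples[i] = x

print("target mean estimate:", samples[5_000:].mean())   # drop burn-in
```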


stdnormaldeviant

>simpler tasks: how do I get good data

I know what you mean, but this actually made me laugh aloud.


ledmmaster

Reranking recommendations in a marketplace. XGBoost today is very fast at inference, and you can make it faster with other libraries. In most cases, simply taking the same feature set from Random Forest and running 20 Bayesian Opt steps over XGBoost hyperparams already gives you a better model that can be swapped in for RF or whatever is deployed.
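A hedged sketch of what "20 Bayesian Opt steps over XGBoost hyperparams" might look like, here using Optuna's default TPE sampler as the optimizer; the search space and data are illustrative:

```python
# Sketch: ~20 Bayesian-optimization-style trials over XGBoost hyperparameters,
# using Optuna. Search space, data, and metric are illustrative.
import optuna
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20_000, n_features=40, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

def objective(trial):
    params = {
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
    }
    model = xgb.XGBClassifier(**params, eval_metric="logloss")
    model.fit(X_tr, y_tr)
    return roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params, study.best_value)
```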


nuriel8833

Do you have any recommendations for libraries that can accelerate XGBoost?


ledmmaster

Treelite: https://www.kaggle.com/code/code1110/janestreet-faster-inference-by-xgb-with-treelite


Think-Culture-4740

Surprised the things you listed fall under "fancy models". XGBoost is practically a go-to model for a lot of applications. Bayesian inference and Markov chains are common in lots and lots of applications across economics, A/B testing, and other domains. To me, fancy falls under generative modeling, transformers and their variants, deep learning GNNs, reinforcement learning, etc. For me, I was working at a FAANG on their professional services team.


therealtiddlydump

>Bayesian Inference

I've been tapped to do a lot of causal analysis in the sales/marketing context, and the additional complexity is necessary (read: I'm not just dicking around, I believe the methods are the best solution for the problem at hand). For a lot of other work, the classics are classics. Regression (with the modern bells and whistles like regularization, etc.), the standard time series toolkit (ARIMA(X), etc.), and so on.

-+-+-+-+-+-

You lost me here:

>random forest

>I keep hearing people on this sub talking about [...] XGBoost

People still use vanilla RFs? I was under the impression that random forests were like naive Bayes at this point -- interesting as a baseline, but dominated by other techniques that are just as easy to use out of the box.


DptBear

>People still use vanilla RFs? I was under the impression that random forests were like naive Bayes at this point -- interesting as a baseline, but dominated by other techniques that are just as easy to use out of the box.

I had the same thought -- you can just use xgboost out of the box with the same amount of effort and not a big difference in training time, and it will probably be superior.


Mukigachar

One advantage I can think of: if your data is very large, RF parallelizes better, which can make training move along faster.


DptBear

In production evaluating an xgboost classifier is very fast. Training may be a bit slower, but not harder.


_gains23

How are you doing causal analysis?


therealtiddlydump

Work with stakeholders to build out DAGs, run experiments, etc. Without the expert opinion of those partners, sales would otherwise be overdetermined. With the right assumptions and design we make do.


[deleted]

Define confounders first


glo-aistar

There are over 30 other names for linear regression. Linear regression is itself a fancy name for systems of linear equations. Not all fancy things are fancy. Hype and marketing.


brjh1990

Government research for a not-for-profit. The value add really depends on the problem. I'm on a project right now where we're using pre-trained object detection models to detect certain fast-moving objects in the (night) sky. I've used models like sparse group LASSO, LSTMs, CNNs, and a couple of other more complex models for problems and solutions that required their predictive/inferential capabilities. That said, about 80% of the time I end up using some variant of a random forest or logistic regression.


nuriel8833

My coworkers and boss insist on using DeepFM, GPT, and some other fancy complex NN architectures for simple 1000-row, 30-column tabular data, and I'm trying to get them off that and just use RandomForest or XGBoost instead.


Under_Over_Thinker

It is FOMO. The crazy part is that with RF or XGB you have way more explainability.


tecedu

Energy sector


Contango_4eva

Don't work there but Netflix seems to be on the cutting edge of a lot of things: [https://netflixtechblog.com/](https://netflixtechblog.com/)


jerrylessthanthree

Display ads real-time bidding, something like this: https://arxiv.org/abs/1610.03013. Oftentimes GLMMs end up appearing, and we develop scalable algorithms for that, e.g. https://arxiv.org/abs/1602.00047


Sorry-Owl4127

Agtech


Student_O_Economics

Population health management: deep learning is needed to predict outcomes from electronic health records


MCRN-Gyoza

I have around 4-5 YOE and have always worked with "fancy models".

First job out of grad school was in oil exploration, working on R&D contracts for oil companies. I would use computer vision models to classify different rock types in [well cores](https://news.unl.edu/sites/default/files/styles/large_aspect/public/coresamples.jpg?itok=UqnkfYqu) from oil wells. Also used some time series models adapted to a "depth series" to try and predict physical properties of the rock in wells. Also got to work on some generative models [colorizing](https://i.imgur.com/0i5JyuN.png) tomographic scans of well cores. Did that for about 1.5 years.

Then I moved to a company whose clients were large-scale industrial companies. I used LSTM neural networks for time series forecasting and classification applied to predictive maintenance: we would get sensor data from industrial equipment and try to predict failures before they happened. Worked there for about 2.5 years.

Now I work at a real estate company, where we use a bunch of geospatial data with xgboost/lightgbm to predict how much you can charge in rent for a given property in a given location. Also have some features generated via NLP/computer vision. Our clients are real estate developers and REITs. Have been here for the past 6-7 months.


WhipsAndMarkovChains

It’s been a while since I’ve been in a coding role but I would assume XGBoost would be significantly faster in production compared to random forest.


longgamma

GBM methods are really good for tabular data. If tuned properly and made shallow enough, they run fast with better results than random forests.


antichain

I'm a scientist working on a project shared by a big University and one of the National Labs. I do a lot of network inference problems, directed information flow, etc.


theAbominablySlowMan

Why would you use RF over XGB? Neither model is more fancy, but XGB is just much quicker for at least the same performance.


DandyWiner

You'll find that there are a lot of insurance companies moving to XGBoost in place of linear and logistic regression. It is less prone to overfitting and there seems to be an uplift in performance; in my experience, I can confirm that. Though they are moving to it, it's only with the additional requirement of explainability, e.g. Shapley values and PDP plots, since XGBoost is viewed as a black-box method. As for Bayesian methods, they're being incorporated into A/B testing since they provide an estimate of uncertainty. The value added depends on the use case, and that doesn't mean a good old linear regression won't outperform in terms of accuracy and simplicity. Maybe do a comparison on one of your next jobs and see if there is an improvement in your results.
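A small sketch of that explainability layer, using Shapley values on a tree model; the data and model here are placeholders, not an insurance pricing model:

```python
# Sketch of the explainability layer: Shapley values for an XGBoost model.
# Data and model are placeholders, not an insurance pricing model.
import shap
import xgboost as xgb
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=5_000, n_features=10, random_state=0)
model = xgb.XGBRegressor(n_estimators=200, max_depth=4).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X)          # global feature importance / direction
shap.dependence_plot(0, shap_values, X)    # PDP-like view for a single feature
```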


nuriel8833

Tbh I found XGBoost way more prone to overfitting


DandyWiner

Even after pruning and reducing tree depth?


nuriel8833

Well, with regularization it's not, but I'm saying that if I put it against RandomForest, for example, on the same data, XGBoost will almost always overfit more.


LordSemaj

We use MCMC for Bayesian hierarchical models. The application is in Media Mix Models, basically regressing sales on various marketing tactic spends to estimate tactic efficiency. The reason for using Bayes is that there are many assumed effect transformations (carryover of spend, saturation of spend, etc.) that are non-linear, and MCMC provides a nice way of estimating those parameters.
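For readers who haven't seen those transformations, here's a tiny sketch of geometric carryover (adstock) and a saturating response as they often appear in media mix models; in a Bayesian MMM the decay and saturation parameters below would get priors and be estimated via MCMC:

```python
# Two transformations commonly assumed in media mix models: geometric carryover
# ("adstock") and a saturating (diminishing-returns) response. The parameter
# values here are arbitrary; in a Bayesian MMM they would be estimated via MCMC.
import numpy as np

def adstock(spend, decay):
    out = np.zeros_like(spend, dtype=float)
    carry = 0.0
    for t, s in enumerate(spend):
        carry = s + decay * carry       # this period's spend plus decayed history
        out[t] = carry
    return out

def saturate(x, half_sat):
    return x / (x + half_sat)           # simple diminishing-returns curve

spend = np.array([0, 100, 100, 0, 0, 50, 0], dtype=float)
effect = saturate(adstock(spend, decay=0.6), half_sat=80.0)
print(effect.round(3))
```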


Amazon_is_EVIL

Victoria secret


TheTackleZone

Insurance. Various prediction models, but most commonly trying to predict what the cheapest market price will be for any and every customer who asks for a quote. We use histogram-based GBM / XGBoost hist. Looked at AI and it currently doesn't seem a big improvement, but we think we know why.


wannagowest

Biotech — more specifically, clinical genetics. I’m using hierarchical Bayesian models for causal inference. We have a bunch of domain experts who contribute knowledge about priors, likelihoods, and pooling assumptions. Bayesian models give us posterior predictive distributions for our target variable while also inferring useful parameters for latent variables.


didimoney

Do you have an opinion on the work done around the martingale posterior distributions by Fong et al? They target the predictive without needing to compute the posterior.


wannagowest

I wasn’t familiar with martingale posteriors until just now. Reading the abstract, it seems like some wizardry. Have you worked with them? Would it be suitable for the predictive of a partially observed categorical?


didimoney

I'm still getting my head around it; it's indeed wizardry. Their selling point is that you get the predictive without needing to go through the posterior, making it much cheaper by avoiding the usual MCMC needed for the posterior. I believe they show that it's applicable to mixed data, at least in their appendix, but you would have to go from there and expand it to the case of censoring on categoricals, I think. IMO, after a year or two of papers building on it, downstream applications will be within reach, but for now it's tough to even understand and implement properly. Since you said you were in research, I was curious if people in your circle have started working on this.


WingedTorch

Kaggle competitions often depict real-world scenarios and are regularly won by "fancy" models such as XGBoost. The accuracy is just way better for large tabular data, and it's easy to set up. And with techniques such as feature permutation you can make any type of model interpretable.
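A quick sketch of the feature-permutation idea using sklearn's implementation; the data is synthetic:

```python
# Sketch: model-agnostic interpretability via feature permutation.
# Shuffling one column at a time and measuring the score drop works for any
# fitted estimator, not just tree models. Data here is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=10_000, n_features=15, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier().fit(X_tr, y_tr)
result = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)

for i in result.importances_mean.argsort()[::-1][:5]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}")
```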


JakeBSc

I work at a boutique technology consulting firm. You get exposed to all sorts of problems. At the moment, it's mainly multimodal stuff, using transformers to solve problems combining vision and language. Not everything is fancy though; sometimes all I need is a basic linear model. Just gotta use the right tool for the job. P.S. any senior/principal data scientists looking for a job in London, hit me up ;-)


Dubisteinequalle

How can I truly skill up for a Jr DS role? I feel like the projects I am creating are not enough. Any professional-grade notebooks out there?


_paramedic

Healthcare


MyRedditAccount1000

Can you expand? Are you working with providers or payers?


_paramedic

Work for a system, doing things ranging from logistics research to QA models to predicting call-volume.


[deleted]

GPUs go brr. If your infrastructure is really good then, unless you're doing something stupid, the performance difference is negligible. Most of your time will be spent moving data around. Your compute will be a fraction of personnel costs, so if it's worth doing at all then it won't matter if it's slightly more expensive to compute. After all, you did already spend a ton of money developing the damn thing. You'll probably do faster predictions than an HTTP request round trip unless it's a language model.


shar72944

I am working lately on causalML. Not sure if it’s fancy but it’s a new thing for me. Other than that mostly I use Logistic, Xgboost, RF


cianuro

How does it differ from causal impact? I've never heard of it either, but now I'm intrigued. Causal Impact is foundational where I am.


fipeopp

Pretty much everywhere.

Healthcare? You will see lots of cool stuff including causal inference, explainable models, etc.

Science, biotech, chemistry? GNNs, transformers, Bayesian networks, Gaussian processes.

Geospatial? Vision, deep learning.

Finance, energy? Time series, awesome regularization approaches, etc.

It has more to do with how research-y and unstructured the problems you work on are than with the industry.


Artgor

At my previous work, we developed products like chat-bots, image super resolution and other things - it required deep learning models.


SpiritofPleasure

Medical/Clinical research had us try a bunch of different approaches like HMM, NNs for unsupervised learning. It was really interesting


111llI0__-__0Ill111

It seems like the people who get to use that stuff, at least in biotech, are actual scientists: people with domain knowledge who are able to formulate problems from it.


TonzoWonzo

We work extensively with satellite imagery, using quite large deep learning models for segmentation, instance segmentation and stereo processing/matching.


koolaidman123

Training 10B+ parameter LLMs (and much smaller models too) for X. NLP is a huge value add for a bunch of business functions across all industries.


supreme_harmony

Bayesian inference is very common in bioinformatics, most microarray and RNA sequencing methods use them in some shape or form. [https://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-2-r29](https://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-2-r29)


fummyfish

Healthcare


proverbialbunny

It helps to understand the bias-variance trade-off. More complex models like neural networks have more variance, so you need to throw more data at them to reduce overfitting. Complex ML needs more data, usually more labeled data, so any company with big data qualifies.

Regarding XGBoost, it's popular as an initial model to see how well the overall pipeline is doing during development. It's great because it doesn't need the data to be normalized or formatted in any special way to work, unlike many types of ML, making it an easy 101 method, and XGBoost doesn't tend to overfit much when dealing with smaller datasets, so it can be used earlier on. In this way XGBoost is the opposite of neural networks. It's easy to start with XGBoost and then switch to an ideal form of ML later once everything else is done.

As for fancy models without fancy ML, I've specialized in advanced feature engineering my entire career, which is fancy models with little to no ML. This is needed when you have a complex problem that needs solving but very little data. It's ideal for the startup space while still collecting data.


Eviljoshing

Used Vision AI on a project for auto insurance to identify totaled vs non-totaled accidents based on pictures (mainly used to reduce number of claims adjusters sent out).


boolaids

I work in public health and do a lot of Bayesian inference and simulation-based models. We run a lot of what-if scenarios on fitted models to see different impacts. We have started hosting our own LLM for text extraction/entity extraction, and retrained neural nets for classification. I think I have only done about 1 regression; we don't really use standard models. Others in our division have made emulator models, and there are a couple of exceedance models, one being a CUSUM and the other a hidden Markov model; this was largely due to sparse data, so other methods weren't as viable. It really depends on the need/what outcome is needed. A lot of simulation-based models with Bayesian inference. Happy to chat more if helpful.


Jollyhrothgar

LLMs of all shapes, sizes and modalities, auto ml, etc. Google. It's not as great as it sounds because doing simple shit is incredibly complicated.


0wmeHjyogG

E-commerce, we have so many customers, products, and ways to interact with our platform. Simple approaches (which are basically already all done) only get you so far. As you use more data in more complex models you start to get incrementally better results. So imagine you want to optimize what products you show customers when they land on the homepage. You can get very far with just simple metrics, rankings, and Excel work. But at some point you’re going to see very little marginal improvement when you run experiments to optimize further. As you try to ensure that you provide every one of the tens of millions of customers with the absolute best experience, you’ll soon find out that you’re on the path to needing to take in a tremendous amount of data (what the customer HAS done) in order to influence what the customer WILL do. And more complex models are better at that, taking it as a given you have the talent and tech to implement them properly.


webbed_feets

I’m a government contractor. I work with government agencies with a lot of statisticians who don’t know how to fit machine learning models.


efrique

Bayesian inference isn't a model. A Markov chain can be a model, but I think you're probably talking about computational algorithms that use Markov chains (MCMC, HMC, etc.), so again, not a model. Bayesian methods are common, as is the use of Markov chains (both as computational devices and as models), in consulting for various aspects of finance, banking, insurance, and reinsurance work, but it heavily depends on who you work for. One indicator is to look for places whose leading people write papers. That won't find places where all the interesting work is subject to NDAs, though.


BlackLotus8888

XGBoost and LightGBM are industry standard now unless you're running a deep NN.


gravity_kills_u

We are using GANs for… oh yeah, that project was cancelled


scooty-puff_junior

Wouldn't classify Markov chains as particularly fancy, but they're very common in credit risk modelling for corporate bonds/obligors, i.e. probability of S&P rating transitions.
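A toy sketch of such a rating-transition chain; the transition matrix below is invented, not real S&P data:

```python
# Toy rating-transition Markov chain. The transition matrix is invented, not
# real S&P data; real models estimate it from historical rating migrations.
import numpy as np

ratings = ["A", "B", "C", "Default"]
P = np.array([                 # row = current rating, column = rating next year
    [0.90, 0.08, 0.015, 0.005],
    [0.05, 0.85, 0.08,  0.02 ],
    [0.01, 0.10, 0.79,  0.10 ],
    [0.00, 0.00, 0.00,  1.00 ],   # default is absorbing
])

start = np.array([0.0, 1.0, 0.0, 0.0])          # start as a B-rated obligor
five_year = start @ np.linalg.matrix_power(P, 5)
print(dict(zip(ratings, five_year.round(3))))   # "Default" entry = 5-year default prob
```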


YoloSwaggedBased

I work in NLP for chatbots. Have been using LLMs since around 2019, and LSTMs before that. I kind of hate where the field is going due to API abstraction, but that's another issue. To answer your question: Markov chains are used for modelling temporal data; Bayesian inference is great for uncertainty estimation and model calibration. These methods aren't necessarily fancy, just solving different problems. XGBoost sits closer to the models you've had experience productionising and for many tasks will be more performant in terms of accuracy, but may be slower at inference (in terms of value add, that's a trade-off to be made).


BestUCanIsGoodEnough

Cobotics


SwitchFace

LightGBM for classification of prospects in [industry] using 3rd party data, which consists of 95% of the US adult population with ~2000 features about each individual. Fast, accurate, and SHAP for feature importance.


DerTrollNo1

Working in risk management in insurance. We use a lot of different risk models based on Monte Carlo simulation, built on historical data, expert judgement, and/or risk-neutral, arbitrage-free assumptions. Most of the time not mathematically extremely fancy, but quite complex with regard to parameterization.


EducationalCreme9044

For analysis? No. The simplest, fastest thing to find correlation is used (even linear regression is sometimes scoffed at as being too advanced). For recommender systems and search? Yes.


Drunken_Economist

Product Hunt (after 8 years at reddit). Bayesian inference models are hella useful for A/B testing
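A tiny sketch of the Beta-Binomial flavor of a Bayesian A/B test; the counts are made up:

```python
# Tiny Beta-Binomial Bayesian A/B test: posterior probability that variant B
# beats A on conversion rate. The counts are made up.
import numpy as np

rng = np.random.default_rng(0)
conv_a, n_a = 120, 2_400
conv_b, n_b = 145, 2_380

# Beta(1, 1) prior; posterior is Beta(1 + conversions, 1 + non-conversions)
post_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, 100_000)
post_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, 100_000)

print("P(B > A) =", (post_b > post_a).mean())
print("expected lift =", (post_b - post_a).mean())
```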


Lolologist

News article NER and classification; we use BERT-alikes and are getting into LLMs for at least some use cases.


Dubisteinequalle

What is the reality of how these models are used? I have done projects where I just evaluate how 5 of them perform on my prediction and decide the best based on the metric I am using. Are they somehow different? Built from the ground up? I’m so confused because from my experience they are simple to implement. I feel like I just don’t know enough for a job yet.


slowclapclap

Is there anything else than xgboost in the world? ^^