MATLAB Answers


PC Recommendations for MATLAB

Asked by Andy on 26 Sep 2012
Hi Everyone!
I have been asked by my employer to produce a business case detailing the running costs of using MATLAB, or its alternatives, as our main analysis software.
We have approximately 9 million rows of data with 10 points in each row, and the aim is to correlate this data on certain parameters. The amount of data will increase by a minimum of 500,000 rows a year, and we don't want to have to buy new computers each year.
I know this isn't a question about MATLAB code, but those will come in the future!

4 Answers

Answer by Jason Ross on 26 Sep 2012 (Accepted Answer)

I've needed to put together quite a few business cases in my time. A few questions pop up immediately that need to be answered before you start looking at spec sheets and dreaming about cores, SSDs and memory sticks.
  • What is the environment you are using your machines in? An office environment? Are you in the field, plugged into some piece of equipment? Must the machine be portable? Ruggedized?
  • Do you also need to include some sort of storage device to hold data for a period of time (file server, NAS)? Do you need to back this data up?
  • Have you been given a budget? What do you need to pay for with this? Hardware, software, maintenance, electricity, cooling, rack space, IT time, etc?
  • You mention in Jan's answer that this is "real-time". Does this imply that others (meaning humans or other electronic processes) need to consume it? Do you need to post these hourly results on a server somewhere? Does that need to be included in your business case?
  • When you say hourly, are those business hours or 24 hours a day/7 days a week as a fully automated system with paging if it goes down? That can certainly affect how you do your work, and the cost.
  • What is the timeline for hardware replacement? A lot of places depreciate hardware over 3-4 years and use that as the interval for upgrades. If you are running the above 24/7 setup, you'll likely want to look at server-grade hardware with a 4-hour replacement contract, in addition to methods that allow you to fail over or use a clustered system to make sure the analysis keeps going. These costs are non-trivial -- but your requirements may dictate that they are needed if downtime of this system is measured in $K/min.

  4 Comments

Thanks for the clarification. For an application deployment like that, I'd make sure that in addition to the hardware/software costs, you also figure out:
  • Who is going to be on call? It can't be one person, since there will be vacations, sickness, doctor appointments, weddings, funerals, births of children and so on that will make one person unavailable to service the application when it goes down -- and logging into the office at all hours, in the middle of a wedding, from the beach or during childbirth is most definitely not acceptable!
  • Determine right now the desired response time to get things up and running again when it goes down. Is this something that will page at 2 AM and get you out of bed to fix it ASAP, or is business hour support sufficient? Do you have personnel in remote offices who can watch things while you are out?
  • Document, document, document. At the very least you should have a written procedure that people know how to find that details how to restart the system and software. This procedure should be rehearsed a few times without you in the room by the people who will be responsible for it to make sure it's clear. Ideally there should also be instructions for completely rebuilding things from the ground up if the machine you are using gets zorched.
  • I would also recommend not using a desktop machine, and getting this into the server room if possible. Office environments have coffee, cleaning staff, chairs moving around, etc. Server rooms are generally more controlled and less prone to accidents. Also, the server room may be on battery backup so you can live through power outages and other disruptions -- offices, not so much. Also, you don't want some bit of software you install to take down the application -- and you definitely want to control when the machine reboots, too.
  • I could go on more, but the bottom line is that an application like this needs some care in the software design as well as in the support side to be able to run well. Think about all the things that can happen in five years, then have other people review your list and make sure you didn't miss anything -- and then come up with a plan to deal with those things.
  • Make sure you pick a hardware vendor that can supply parts and service for five years. You might also consider having spare parts on hand if you run on real hardware. You could also look into virtualization, which provides quite a few positives that ameliorate the issues of dealing with hardware (or, more accurately, make it the problem of the people running the server with the VMs on it).
I've seen a lot of problems with the "it runs on my desktop but the department becomes dependent on it" applications, just trying to save you some hassles down the road :)
Some things on here I would never have thought of! Thanks Jason!
No problem, I've had to live through (or be witness to) some degree of many of the issues above ... get the answers now, it's easier.



Answer by Jan on 26 Sep 2012 (edited on 26 Sep 2012)

What exactly is the question?
You could solve this even on a netbook, as long as you are willing to wait for the disk swapping to finish. An i3, i7 or Xeon will be much faster, but in my personal experience, investing in more RAM is more effective than buying the hottest processor. When the problem can be parallelized, it is better to buy more cores at a lower clock speed.
As usual in hardware discussions, I want to stress that excellent hardware can accelerate things by a factor of 5 or 10, while an excellent programmer can gain a speedup of a factor of 100 under some conditions.
While I've seen several benchmarks showing that MATLAB runs significantly faster under Linux, my impression from this forum is that MATLAB is most stable under Windows, while there are some strange bugs under OS X and Linux. If you spend too much time fixing Java updates or side effects from the built-in backup mechanism, 10% faster computations are not useful.
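As a rough illustration of the "more cores" point (a minimal sketch only; the variable names and the choice of the first column as reference are my assumptions, not from the question), a plain loop over the parameter columns can be turned into a parfor loop, so the speedup scales with the number of cores rather than the clock rate:

  % Minimal sketch: assumes the 9e6-by-10 dataset fits in RAM as doubles
  % and that column 1 is the reference signal (both are assumptions).
  nRows = 9e6;
  nCols = 10;
  data  = rand(nRows, nCols);      % placeholder for the real measurements
  R = zeros(1, nCols);
  parfor k = 1:nCols               % runs serially without a parallel pool
      c    = corrcoef(data(:, 1), data(:, k));
      R(k) = c(1, 2);
  end

With only 10 columns the gain is modest, but the same pattern pays off once the per-column work grows.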

  5 Comments

I did 9e9*10*8, which is clearly wrong -- the question says 9 million rows, i.e. 9e6, not 9e9. A computer to deal with a 1 GB dataset is very different from one to deal with a 1 TB dataset.
"correlate on certain parameters" means that I hope to be able to view the data changing hourly and compare it to historic data. Also I suppose I should say that the data is collected hourly and the plan is to be able to have "real time" analysis of the data.
Jan on 26 Sep 2012
"Real time" and "hourly" data mean, that the calculations should be finished after less than an hour. Or does "real time" mean, that you want the processing happen inside the limit of 5 seconds, which is a psychological limit in which a human accepts a result as direct reaction to an input?



Answer by Image Analyst on 26 Sep 2012

If you have only a GB of data or less, that does not seem very demanding. I think a garden-variety notebook would handle that with no problem. I got a home computer with an Intel i5, 8 GB of RAM, and a terabyte hard drive last December for $499 (way, way less than MATLAB costs). That computer would easily breeze through your data. If you're doing lots and lots of iterations on that data set (like some sort of iterative simulation), of course it would take longer than a single pass.
It sounds like your boss is concerned about the costs, so luckily a fairly mundane computer would handle your data quite easily. Of course, if you want more computing power and have more money, you can get screaming fast computers here: http://liquidnitrogenoverclocking.com/
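A quick way to sanity-check the "it fits easily" claim on a particular machine (note that the memory function is Windows-only, and the double-precision assumption is mine):

  datasetBytes = 9e6 * 10 * 8;      % rows x points x 8 bytes per double, about 0.7 GB
  m = memory;                       % Windows only; reports free memory for arrays
  fprintf('Dataset: %.2f GB, free for arrays: %.2f GB\n', ...
          datasetBytes / 2^30, m.MemAvailableAllArrays / 2^30);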

  4 Comments

True. And it's best if you can have both.
Jan on 26 Sep 2012
A 5GHz machine increases the energy costs and the CO2 production, while a good programmer reduces both.
I dunno... I can be pretty hot-headed :)



Answer by Richard Crozier on 4 Oct 2012 (edited on 4 Oct 2012)

Use an Amazon EC2 cloud machine. It will be upgraded regularly, and you can later use the distributed computing toolbox to spread the load across several computers. You also only need to rent the computing time you actually use, instead of having a large, fixed-cost depreciating asset.
You can also upgrade the machine's RAM etc. using the same machine image, so there is no migrating everything to a new PC when it's no longer powerful enough. If your company would like assistance doing this, I'm open to offers ;-)
Oh, yes, and if it falls over, you can restart in under a minute with a new machine node, with the same machine image. You will not require tech support to maintain the machine.
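Once a cluster profile for the cloud machines has been configured, the analysis can be submitted as a batch job with the parallel computing tools; this is only a sketch, and 'MyCloudCluster' and runHourlyAnalysis are assumed placeholder names, not anything that exists out of the box:

  c   = parcluster('MyCloudCluster');                     % assumed cluster profile name
  job = batch(c, @runHourlyAnalysis, 0, {}, 'Pool', 8);   % request 8 workers for parfor inside
  wait(job);                                              % block until the job finishes
  diary(job)                                              % display the job's command-line output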
