
[THREDDS #TWN-232939]: AWS THREDDS NEXRAD server down?



Kevin,

I just rebooted it.

Ryan

address@hidden> wrote:

> New Client Reply: AWS THREDDS NEXRAD server down?
>
> Hi Ryan,
>
>
> I suspect the AWS server is in need of another restart ... I'm getting
> reports from my students of problems again, and I am seeing the same.
>
>
> Tomorrow is the last of the student presentations for my class, so I think
> this problem will go away for now, but definitely it's something to look
> further into for the future.
>
>
> Thanks for checking,
>
>
> Kevin
>
>
> _____________________________________________
> Kevin Tyle, Manager of Departmental Computing
> Dept. of Atmospheric & Environmental Sciences
> University at Albany
> Earth Science 235, 1400 Washington Avenue
> Albany, NY 12222
> Email: address@hidden
> Phone: 518-442-4578
> _____________________________________________
> ________________________________
> From: Tyle, Kevin R
> Sent: Thursday, May 4, 2017 5:05:25 PM
> To: Ryan May
> Cc: address@hidden
> Subject: RE: [THREDDS #TWN-232939]: AWS THREDDS NEXRAD server down?
>
> Ok thanks!  Projects should be done soon, so hopefully the load will ebb
> in time for you to figure out what’s causing the leakage.
>
>
> From: Ryan May [mailto:address@hidden]
> Sent: Thursday, May 04, 2017 5:04 PM
> To: Tyle, Kevin R <address@hidden>
> Cc: address@hidden
> Subject: Re: [THREDDS #TWN-232939]: AWS THREDDS NEXRAD server down?
>
> Kevin,
>
> I think the TDS is leaking file handles. I rebooted the machine and that
> seems to have restored the space.
>
> I need to dig in more to understand why it's leaking, but you should be
> able to use it more reliably now.
>
> Ryan
>
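
A minimal sketch of one way to look for leaked handles, assuming a Linux host
where /proc is available. The thread does not say how the leak was diagnosed,
and the helper name deleted_but_open is hypothetical. A process that keeps a
deleted file open still holds that file's disk space, which would be
consistent with a reboot freeing the space.

```python
import os


def deleted_but_open(pid="self"):
    """Return (fd, target) pairs for descriptors that point at deleted files.

    Assumes Linux: /proc/<pid>/fd holds one symlink per open descriptor, and
    the link target of an unlinked file ends with " (deleted)".
    """
    fd_dir = "/proc/{}/fd".format(pid)
    held = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.endswith(" (deleted)"):
            held.append((int(name), target))
    return held


if __name__ == "__main__":
    for fd, target in deleted_but_open():
        print(fd, target)
```

Checking another process (for example the Java process running the TDS) would
mean passing its PID and having permission to read its /proc entry.
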
> On Thu, May 4, 2017 at 2:59 PM, Ryan May <address@hidden> wrote:
> I see the 500s. It looks like the disk is filling up (small VM), but I'm not
> sure why; it's not the usual suspect. I no longer expect OPeNDAP to make a
> difference.
>
> I'm digging in now and will let you know when I have it fixed.
>
> Ryan
>
> On Thu, May 4, 2017 at 2:54 PM, Tyle, Kevin R <address@hidden> wrote:
> It’s not a 404 … it’s a 500 server error (traceback below).
>
> Meanwhile, I’ll sub in OPeNDAP for CDMRemote and see what happens.
>
> Traceback (most recent call last):
>   File "<ipython-input-9-721f2a49ad41>", line 13, in <module>
>     data = Dataset(ds.access_urls['CdmRemote'])
>   File "/linuxapps/anaconda/lib/python2.7/site-packages/siphon/cdmr/dataset.py", line 120, in __init__
>     self._read_header()
>   File "/linuxapps/anaconda/lib/python2.7/site-packages/siphon/cdmr/dataset.py", line 123, in _read_header
>     messages = self.cdmr.fetch_header()
>   File "/linuxapps/anaconda/lib/python2.7/site-packages/siphon/cdmr/cdmremote.py", line 31, in fetch_header
>     return self._fetch(self.query().add_query_parameter(req='header'))
>   File "/linuxapps/anaconda/lib/python2.7/site-packages/siphon/cdmr/cdmremote.py", line 16, in _fetch
>     return read_ncstream_messages(BytesIO(self.get_query(query).content))
>   File "/linuxapps/anaconda/lib/python2.7/site-packages/siphon/http_util.py", line 375, in get_query
>     return self.get(url, query)
>   File "/linuxapps/anaconda/lib/python2.7/site-packages/siphon/http_util.py", line 459, in get
>     text))
> HTTPError: Error accessing http://thredds-aws.unidata.ucar.edu/thredds/cdmremote/nexrad/level2/S3/2017/01/16/KCLE/KCLE20170116_124911_V06?req=header: 500 Internal Server Error
>
>
>
> From: Ryan May [mailto:address@hidden]
> Sent: Thursday, May 04, 2017 4:49 PM
> To: Tyle, Kevin R <address@hidden>
> Cc: address@hidden
> Subject: Re: [THREDDS #TWN-232939]: AWS THREDDS NEXRAD server down?
>
> Kevin,
>
> That's really strange. I see a lot of activity coming from
> reed.atmos.albany.edu, but I don't see any 404 errors. Can you try using
> OPeNDAP instead of CDMRemote?
>
> use:
>
> from netCDF4 import Dataset
>
> and:
>
> data = Dataset(ds.access_urls['OPENDAP'])
>
>
> Ryan
>
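
A rough sketch of trying CDMRemote first and falling back to OPeNDAP, in the
spirit of the suggestion above. open_first_available is a hypothetical helper,
not part of siphon or netCDF4; it assumes ds.access_urls behaves like a dict
keyed by service name, as in the snippets in this thread.

```python
def open_first_available(access_urls, openers):
    """Try each (service, opener) pair in order; return the first that works.

    access_urls: mapping of service name to URL (assumed dict-like).
    openers: list of (service name, callable taking a URL) pairs.
    """
    errors = {}
    for service, opener in openers:
        url = access_urls.get(service)
        if url is None:
            # This dataset does not advertise the service; try the next one.
            continue
        try:
            return service, opener(url)
        except Exception as exc:
            # Remember why this service failed and move on.
            errors[service] = exc
    raise RuntimeError("no access method succeeded: {!r}".format(errors))


# Hypothetical usage with the notebook's objects:
# from siphon.cdmr import Dataset as CdmrDataset
# from netCDF4 import Dataset as NcDataset
# service, data = open_first_available(
#     ds.access_urls,
#     [('CdmRemote', CdmrDataset), ('OPENDAP', NcDataset)])
```

The returned service name matters because the two libraries expose the data
through different objects, so later code may need to branch on it.
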
>
> On Thu, May 4, 2017 at 2:22 PM, Tyle, Kevin R <address@hidden> wrote:
> Hi Ryan,
>
> Reviving this "thredd" today ...
>
> My students and I are having lots of issues today with connections to the
> AWS Level 2 thredds server.
>
> I've rewritten things so the code traps 404 errors, but it seems that once
> we catch one, no further connection attempts succeed unless we restart the
> kernel and try again after a few minutes.
>
> Any ideas?
>
> Thanks,
>
> Kevin
>
> ---------------------------
>
> Code snippet:
>
> ---------------------------------------------------------------------
> import sys
> import time
>
> meshes = []
> max_attempts = 5
> for name, ds in sorted(cat.datasets.items()):
>     # Each item is a (name, Dataset) pair; access the dataset over CDMRemote
>     print(ds)
>
>     # Try to open the dataset, pausing 5 seconds after each failure
>     data = None
>     for attempt in range(max_attempts):
>         try:
>             data = Dataset(ds.access_urls['CdmRemote'])
>         except Exception:
>             print("Caught an exception; will try URL again in 5 secs")
>             sys.stdout.flush()
>             time.sleep(5)
>         else:
>             print("Loaded dataset successfully")
>             break
>
>     if data is None:
>         # Every attempt failed; skip this dataset
>         print("Exceeded # of attempts to connect to remote server")
>         sys.stdout.flush()
>         continue
>
>     # Pull out the data of interest
>     sweep = 0
>     rng = data.variables['distanceR_HI'][:]
>     az = data.variables['azimuthR_HI'][sweep]
>     ref_var = data.variables['Reflectivity_HI']
>
>     # Convert data to float and coordinates to Cartesian
>     ref = raw_to_masked_float(ref_var, ref_var[sweep])
>     x, y = polar_to_cartesian(az, rng)
>
>     # Plot the data and the timestamp
>     mesh = ax.pcolormesh(x, y, ref, cmap=ref_cmap, norm=ref_norm, zorder=0)
>     text = ax.text(0.65, 0.03, data.time_coverage_start,
>                    transform=ax.transAxes, fontdict={'size': 16})
>
>     # Collect the things we've plotted so we can animate
>     meshes.append((mesh, text))
> print('done')
>
> ---------------------------------------------------------------------
> Output:
>
> ---------------------------------------------------------------------
> <siphon.catalog.Dataset object at 0x7f1bf57d6c50>
> Loaded dataset successfully
> <siphon.catalog.Dataset object at 0x7f1bf57d6cd0>
> Loaded dataset successfully
> <siphon.catalog.Dataset object at 0x7f1bf57d6c90>
> Loaded dataset successfully
> <siphon.catalog.Dataset object at 0x7f1bf57d6d10>
> Loaded dataset successfully
> <siphon.catalog.Dataset object at 0x7f1bf57d6d50>
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Exceeded # of attempts to connect to remote server
> <siphon.catalog.Dataset object at 0x7f1bf57d6d90>
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Exceeded # of attempts to connect to remote server
> <siphon.catalog.Dataset object at 0x7f1bf57d6dd0>
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Exceeded # of attempts to connect to remote server
> <siphon.catalog.Dataset object at 0x7f1bf5836e90>
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Caught an exception; will try URL again in 5 secs
> Exceeded # of attempts to connect to remote server
> done
>
> ---------------------------------------------------------------------
>
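
The retry loop above waits a fixed 5 seconds between attempts. A hedged sketch
of the same idea with exponential backoff follows; retry_with_backoff and its
parameters are hypothetical and not part of siphon.

```python
import time


def retry_with_backoff(func, attempts=5, base_delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                # Out of attempts: let the caller see the last error.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)


# Hypothetical usage with the notebook's objects:
# data = retry_with_backoff(lambda: Dataset(ds.access_urls['CdmRemote']))
```

Backoff only helps with transient failures. The 500 errors in this thread came
from a server that needed a restart, so no amount of client-side retrying
would have recovered the connection.
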
>
> -----Original Message-----
> From: Tyle, Kevin R
> Sent: Thursday, April 20, 2017 11:15 AM
> To: 'address@hidden' <address@hidden>
> Subject: RE: [THREDDS #TWN-232939]: AWS THREDDS NEXRAD server down?
>
> Hi Ryan,
>
> I'm now running into a sporadic error with the following line in your demo
> notebook:
>
> data = Dataset(ds.access_urls['CdmRemote'])
>
> I get the following error:
>
> HTTPError: Error accessing http://thredds-aws.unidata.ucar.edu/thredds/cdmremote/nexrad/level2/S3/2017/03/14/KENX/KENX20170314_160423_V06?req=header: 404 Not Found
>
> Any ideas?
>
> Thanks,
>
> Kevin
>
>
> -----Original Message-----
> From: Unidata THREDDS Support [mailto:address@hidden]
> Sent: Tuesday, April 18, 2017 11:52 AM
> To: Tyle, Kevin R <address@hidden>
> Cc: address@hidden
> Subject: [THREDDS #TWN-232939]: AWS THREDDS NEXRAD server down?
>
> Kevin,
>
> Actually, I was in contact yesterday with someone from Illinois who was
> running some statistics, apparently against KBUF, fetching a new volume every
> couple of minutes. Nothing crazy, but the use was sustained enough to bring
> down the machine.
>
> Anyhow, the problem was that the memory on the machine was filling up (the
> micros only have 1GB). I'm hoping an extra GB of memory can solve the
> problem. I'm already doing some pretty aggressive cleaning of the cache, so
> disk isn't an issue.
>
> If you give me a heads up on days you're expecting heavy use, I can keep a
> better watch on it. I'm pretty confident the bigger machine will help,
> though.  I'm also happy to bump up the instance to something even beefier
> for a couple days if necessary.
>
> Ryan
>
> > Hi Ryan,
> >
> > I wonder if that "someone" could have been me last night, as I was doing
> > things like changing the site from Louisville for the single, most recent
> > image, as well as changing the date and location for the animation.
> > Although, it was probably only two times that I ran it before things went
> > kablooey.  I'm intending for our ATM350 students to use the server for
> > their case study presentation that concludes the class in a couple weeks,
> > so it would be good to have this quasi-stable, although of course I will
> > have some backup plans.
> >
> > I found that on the THREDDS server I'm running on AWS for big weather
> > web, I had to greatly reduce caching in order to keep the root file system
> > from filling up, in case that was what the issue was here.
> >
> > I'll do some more testing today to see how the larger instance handles
> > things!
> >
> > Cheers,
> >
> > Kevin
> >
> >
> >
> > -----Original Message-----
> > From: Unidata THREDDS Support [mailto:address@hidden]
> > Sent: Tuesday, April 18, 2017 11:36 AM
> > To: Tyle, Kevin R <address@hidden>
> > Cc: address@hidden
> > Subject: [THREDDS #TWN-232939]: AWS THREDDS NEXRAD server down?
> >
> > Kevin,
> >
> > Someone else has been utilizing the server a lot, and it seems to be
> > overwhelming the limited memory of the micro AWS instance we were hosting
> > it on. I've now bumped the machine up to a bigger one (all of 2GB of
> > RAM!).
> >
> > Please feel free to beat on this machine so we can see if that solves
> > the problem.
> >
> > Ryan
> >
> > > Hi folks,
> > >
> > > Last night I was using the THREDDS_Radar_Server_AWS Jupyter notebook
> > > to interact with the THREDDS server you are running on AWS to serve the
> > > NEXRAD Level 2 archive.  While the first couple of attempts worked fine, I
> > > noticed that the server stopped responding, and it looks to still be down
> > > this morning:
> > >
> > > ConnectionError:
> > > HTTPConnectionPool(host='thredds-aws.unidata.ucar.edu', port=80):
> > > Max retries exceeded with url:
> > > /thredds/radarServer/nexrad/level2/S3/dataset.xml (Caused by
> > > NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection
> > > object at 0x2b70cd146750>: Failed to establish a new
> > > connection: [Errno 111] Connection refused',))
> > >
> > >
> > >
> > > Can you take a look and advise, please?
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: TWN-232939
> > Department: Support THREDDS
> > Priority: Normal
> > Status: Closed
> > ===================
> > NOTE: All email exchanges with Unidata User Support are recorded in the
> > Unidata inquiry tracking system and then made publicly available through
> > the web.  If you do not want to have your interactions made available in
> > this way, you must let us know in each email you send to us.
> >
> >
> >
>
> --
> Ryan May, Ph.D.
> Software Engineer
> UCAR/Unidata
> Boulder, CO
>
>
>
> Ticket Details
> ===================
> Ticket ID: TWN-232939
> Department: Support THREDDS
> Priority: Critical
> Status: Open
> Link:  https://andy.unidata.ucar.edu/esupport/staff/index.php?_m=tickets&_a=viewticket&ticketid=28176
>
>


-- 
Ryan May, Ph.D.
Software Engineer
UCAR/Unidata
Boulder, CO



Ticket Details
===================
Ticket ID: TWN-232939
Department: Support THREDDS
Priority: Critical
Status: Open
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.